PREFACE (INTERNATIONAL EDITION)

The chief objective of the fourth (international) edition is to respond to the tremendous
amount of technological progress in communication systems over the decade since the 
third edition was published. At the same time, new software and teaching tools have 
also become available, making it much easier to provide solid and illustrative examples as 
well as more experimental opportunities for students. In this new edition, major changes are 
implemented to incorporate recent technological advances of telecommunications. To captivate 
students' attention and make it easier for students to relate the course materials to their daily 
experience with communication tools, we will provide relevant information on the operation 
and features of cellular systems, wireless local area networks (LANs), and wire-line (digital 
subscriber loop or DSL) internet services, among others. 

Major Revision 

A number of critical changes are motivated by the need to emphasize the fundamentals of 
digital communication systems that have permeated our daily lives. Specifically, in light of the
widespread applications of new technologies such as spread spectrum and orthogonal frequency
division multiplexing (OFDM), we present a new chapter (Chapter 11) on spread spectrum
communications and a new chapter (Chapter 12) on frequency-selective channels and OFDM
systems. As practical examples of such systems, we provide a basic introduction to current
wireless communication standards including cellular systems and IEEE 802.11 a/b/g/n wireless
LAN systems. In addition, we summarize the latest DSL modem technologies and services. At
the fundamental level, information theory and coding have also been transformed by progress
in several important areas. In this edition, we include the basic principles of multiple-input
multiple-output (MIMO) technology, which has begun to see broad commercial application. We
also cover several notable breakthroughs in error correction coding, including soft decoding, 
turbo codes, and low-density parity check (LDPC) codes. 

To enhance the learning experience and to give students opportunities for computer- 
based experimental practice, relevant MATLAB examples and exercises have been provided 
in chapters that can be enhanced by these hands-on experiments. 


Organization 

The fourth (international) edition begins with a traditional review of signal and system
fundamentals and proceeds to the core communication topics of analog modulation and digital
pulse-coded modulation. We then present the fundamental tools of probability theory and
random processes to be used in the design and analysis of digital communications in the rest of
this text. After coverage of the fundamentals of digital communication systems, the last two
chapters provide an overview of information theory and the fundamentals of forward error
correction codes.

Ideally, the subjects covered in this text should be taught in two courses: one on the basic 
operations of communication systems, and one on the analysis of modern communication 
systems under noise and other distortions. The former relies heavily on deterministic
analytical tools such as Fourier series, Fourier transforms, and the sampling theorem, while the latter
relies on tools from probability and random processes to tackle the unpredictability of message
signals and noises. Today, however, with so many competing courses, it may be difficult to
squeeze into a typical electrical engineering curriculum two basic courses on communications. 
Some universities do require a course in probability and random processes as a prerequisite, 
allowing both areas to be covered reasonably well in a one-semester course. This book is 
designed for adoption both as a one-semester course (in which the deterministic aspects of 
communication systems are emphasized with little consideration of the effects of noise and 
interference) and for a course that deals with both the deterministic and probabilistic aspects 
of communication systems. The book itself is self-contained, providing all the necessary
background in probabilities and random processes. However, as stated earlier, if both deterministic
and probabilistic aspects of communications are to be covered in one semester, it is highly 
desirable for students to have a good background in probabilities. 

Chapter 1 introduces a panoramic view of communication systems. All the important 
concepts of communication theory are explained qualitatively in a heuristic way. This attracts 
students to communications topics in general. With this momentum, they are motivated to 
study the tool of signal analysis in Chapters 2 and 3, where they are encouraged to see a signal 
as a vector, and to think of the Fourier spectrum as a way of representing a signal in terms 
of its vector components. Chapters 4 and 5 discuss amplitude (linear) and angle (nonlinear) 
modulations respectively. Many instructors feel that in this digital age, modulation should be 
deemphasized. We hold that modulation is not so much a method of communication as a basic 
tool of signal processing; it will always be needed, not only in the area of communication 
(digital or analog), but also in many other areas of electrical engineering. Hence, neglecting 
modulation may prove to be rather shortsighted. Chapter 6, which serves as the fundamental 
link between analog and digital communications, describes the process of analog-to-digital 
conversion (ADC). It provides details of sampling, pulse code modulation (including DPCM), 
delta modulation, speech coding (vocoder), image/video coding, and compression. Chapter 7 
discusses the principles and techniques used in digital modulation. It introduces the concept of 
channel distortion and presents equalization as an effective means of distortion compensation. 

Chapters 8 and 9 provide the essential background on theories of probability and
random processes. These comprise the second tool required for the study of communication
systems. Every attempt is made to motivate students and to maintain their interest through
these chapters by providing applications to communications problems wherever possible.
Chapter 10 presents the analysis of digital communication systems in the presence of noise.
It covers optimum signal detection in digital communication. Chapter 11 focuses on spread
spectrum communications. Chapter 12 presents various practical techniques that can be used
to combat practical channel distortions. This chapter captures both channel equalization and
the broadly applied technology of OFDM. Chapter 13 provides a tutorial of information theory.
Finally, the principles and key practical aspects of error control coding are given in Chapter 14.

One of the goals for writing this book has been to make learning a pleasant or at least a 
less intimidating experience for students by presenting the subject in a clear, understandable, 
and logically organized manner. Every effort has been made to deliver insights—rather than 
just understanding—as well as heuristic explanations of theoretical results wherever possible. 
Many examples are provided for further clarification of abstract results. Even partial success 
in achieving this stated goal would make all our efforts worthwhile. 

A Whole New World 

There have been a number of major technology developments since the publication of the 
third edition in 1998. First of all, the cellular telephone has deeply penetrated the daily lives of
urban and suburban households in most developed and even developing nations. In 1998 very
few students carried beepers and cell phones into the classroom. Now, nearly every college
student has a cell phone. Second, in 1998 most of the household internet connections were linked via
low-speed (28.8 kbit/s) voiceband modems. Today, a majority of our students are connected
to cyberspace through DSL or cable services. In addition, wireless LAN has made esoteric 
terms such as IEEE 802.11 into household names. Most students in the classroom have had 
experience exploring these technologies. 

Because of the vast technological advances, this new generation of students is extremely 
interested in learning about these new technologies and their implementation. The students are 
eager to understand how and where they may be able to make contributions in industry. Such 
strong motivation must be encouraged and taken advantage of. This new edition will enable 
instructors either to cover the topics themselves or to assign reading materials such that the 
students can acquire relevant information. The new edition achieves these goals by stressing 
the digital aspects of the text and by incorporating the most commonly known wireless and
wire-line digital technologies. 

Course Adoption 

With a combined teaching experience of over 55 years, we have taught communication classes 
under both quarter and semester systems in several major universities. In complementary 
fashion, students' personal experiences with communication systems have continuously been
multiplying, from simple radio sets in the 1960s to the twenty-first century, with its easy access
to wireless LAN, cellular devices, satellite radio, and home internet services. Hence, more 
and more students are interested in learning how familiar electronic gadgets work. With this 
important need and our past experiences in mind, we revised the fourth (international) edition 
of this text to fit well within several different curriculum configurations. In all cases, basic 
coverage should teach the fundamentals of analog and digital communications (Chapters 1-7). 

One-Semester Course (without strong probability background) 

In many existing curricula, undergraduate students are not exposed to simple probability tools
until they begin to take communications. Often this occurs because the students were sent to
take an introductory statistical course that is disconnected from engineering science. This text
is well suited to students of such a background. The first seven chapters form a comprehensive
coverage of modern digital and analog communication systems for average ECE undergraduate
students. Such a course can be taught in one semester (40-45 instructional hours). Under the
premise that each student has built a solid background in Fourier analysis via a prerequisite
class on signals and systems, most of the first three chapters can be treated as a review in one
week. The rest of the semester can be fully devoted to teaching Chapters 4 through 7 with 
partial coverage on the practical systems of Chapters 11 and 12 to enhance students' interest. 

One-Semester Course (with a strong probability background) 

For curricula that have strengthened the background coverage of probability theory, a much 
more extensive coverage of digital communications can be achieved within one semester. 
A rigorous probability class can be taught within the context of signal and system analysis 
(cf. George R. Cooper and Clare D. McGillem, Probabilistic Methods of Signal and System
Analysis, Oxford University Press, 1999). For this scenario, in addition to Chapters 1 through 7,
Chapter 10 and part of Chapter 12 on equalization can also be taught in one semester, provided 
the students have a solid probability background that can limit the coverage of Chapters 8 
and 9 to a few hours. Students completing this course would be well prepared to enter the 
telecommunications industry or to enter graduate studies. 

Two-Semester Series (without a separate probability course) 

The entire text can be thoroughly covered in two semesters for a curriculum that does not 
have any prior probability course. In other words, for a two-course series, the goal is to teach 
both communication systems and fundamentals of probabilities. In an era of many competing
courses in the ECE curriculum, it is hard to set aside two semester courses for communications 
alone. On the other hand, most universities do have a probability course that is separately 
taught by nonengineering professors. In this scenario it would be desirable to fold probability 
theory into the two communication courses. Thus, for two semester courses, the coverage can 
be as follows:

* 1st semester: Chapters 1-7 (Signals and Communication Systems)

* 2nd semester: Chapters 8-12 (Modern Digital Communication Systems)

One-Quarter Course (with a strong probability background) 

In a quarter system, students must have prior exposure to probability and statistics at a rigorous
level (cf. Cooper and McGillem, Probabilistic Methods of Signal and System Analysis). They
must also have solid knowledge of Fourier analysis. Within a quarter, the class can impart the
basics of analog and digital communication systems (Chapters 3-7), and, in Chapters 10 and 11,
respectively, analysis of digital communication systems and spread spectrum communications. 

One-Quarter Course (without a strong probability background) 

In the rare case that students come in without much background in probability, it is important 
for them to acquire basic knowledge of communication systems. It is wise not to attempt to 
analyze digital communication systems. Instead, basic coverage without prior knowledge of 
probability can be achieved by teaching the operations of analog and digital systems (Chapters 
1-7) and providing a high-level discussion of spread spectrum wireless systems (Chapter 11). 

Two-Quarter Series (with basic probability background) 

Unlike a one-quarter course, a two-quarter series can be well designed to teach most of the 
important materials on communication systems and their analysis. The entire text can be 
extensively taught in two quarters for a curriculum that has some preliminary coverage of 
Fourier analysis and probabilities. Essentially viewing Chapters 1 through 3 and Chapter 8 as
partly new and partly reviews, the coverage can be as follows: 

* 1st quarter: Chapters 1-9 (Communication Systems and Analysis) 

* 2nd quarter: Chapters 10-14 (Digital Communication Systems) 

MATLAB and Laboratory Experience 

Since many universities no longer have hardware communication laboratories, MATLAB-based
communication system exercises are included to enhance the learning experience.
Students will be able to design systems and modify their parameters to evaluate the overall
effects on the performance of communication systems through computer displays and the
measurement of bit error rates. Students will acquire first-hand knowledge on how to design
and perform simulations of communication systems.

INTRODUCTION

Over the past decade, the rapid expansion of digital communication technologies has
been simply astounding. Internet, a word and concept once familiar only to technologists
and the scientific community, has permeated every aspect of people’s daily lives.
It is quite difficult to find any individual in a modern society who has not been touched by new
communication technologies ranging from cellular phones to Bluetooth. This book examines
the basic principles of communication by electric signals. Before modern times, messages
were carried by runners, carrier pigeons, lights, and fires. These schemes were adequate for the 
distances and “data rates” of the age. In most parts of the world, these modes of communication 
have been superseded by electrical communication systems,* which can transmit signals over 
much longer distances (even to distant planets and galaxies) and at the speed of light. 

Electrical communication is dependable and economical; communication technologies 
improve productivity and energy conservation. Increasingly, business meetings are conducted 
through teleconferences, saving the time and energy formerly expended on travel. Ubiquitous
communication allows real-time management and coordination of project participants
from around the globe. E-mail is rapidly replacing the more costly and slower “snail mail.”
E-commerce has also drastically reduced some costs and delays associated with marketing, 
while customers are also much better informed about new products and product information. 
Traditional media outlets such as television, radio, and newspapers have been rapidly evolving 
in the past few years to cope with, and better utilize, the new communication and networking 
technologies. The goal of this textbook is to provide the fundamental technical knowledge 
needed by next-generation communication engineers and technologists for designing even 
better communication systems of the future. 


1.1 COMMUNICATION SYSTEMS 

Figure 1.1 presents three typical communication systems: a wire-line telephone-cellular phone
connection, a TV broadcasting system, and a wireless computer network. Because of the 
numerous examples of communication systems in existence, it would be unwise to attempt 
to study the details of all kinds of communication systems in this book. Instead, the most 
efficient and effective way to learn about communication is by studying the major functional
blocks common to practically all communication systems. This way, students are not
merely learning the operations of those existing systems they have studied; more importantly,
they can acquire the basic knowledge needed to design and analyze new systems never
encountered in a textbook. To begin, it is essential to establish a typical communication system
model as shown in Fig. 1.2. The key components of a communication system are as
follows.

* With the exception of the postal service.

The source originates a message, such as a human voice, a television picture, an e-mail 
message, or data. If the data is nonelectric (e.g., human voice, e-mail text, television video), 
it must be converted by an input transducer into an electric waveform referred to as the 
baseband signal or message signal through physical devices such as a microphone, a computer 
keyboard, or a CCD camera. 

The transmitter modifies the baseband signal for efficient transmission. The transmitter 
may consist of one or more subsystems: an A/D converter, an encoder, and a modulator. 
Similarly, the receiver may consist of a demodulator, a decoder, and a D/A converter. 

Figure 1.2 Communication system.

The channel is a medium of choice that can convey the electric signals at the transmitter 
output over a distance. A typical channel can be a pair of twisted copper wires (telephone and
DSL), coaxial cable (television and internet), an optical fiber, or a radio link. Additionally, a
channel can also be a point-to-point connection in a mesh of interconnected channels that form 
a communication network. 

The receiver reprocesses the signal received from the channel by reversing the signal 
modifications made at the transmitter and removing the distortions made by the channel. The 
receiver output is fed to the output transducer, which converts the electric signal to its original 
form—the message. 

The destination is the unit to which the message is communicated. 

A channel is a physical medium that behaves partly like a filter that generally attenuates 
the signal and distorts the transmitted waveforms. The signal attenuation increases with the 
length of the channel, varying from a few percent for short distances to orders of magnitude
in interplanetary communications. Signal waveforms are distorted because of physical
phenomena such as frequency-dependent gains, multipath effects, and Doppler shift. For 
example, a frequency-selective channel causes different amounts of attenuation and phase 
shift to different frequency components of the signal. A square pulse is rounded or “spread 
out” during transmission over a low-pass channel. These types of distortion, called linear 
distortion, can be partly corrected at the receiver by an equalizer with gain and phase 
characteristics complementary to those of the channel. Channels may also cause nonlinear
distortion through attenuation that varies with the signal amplitude. Such distortions
can also be partly corrected by a complementary equalizer at the receiver. Channel distortions,
if known, can also be precompensated by transmitters by applying channel-dependent
predistortions. 
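
The rounding and spreading of a square pulse by a low-pass channel can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the channel is modeled as a first-order discrete-time low-pass filter, and the function name `low_pass`, the smoothing factor `alpha`, and the pulse length are hypothetical choices, not values from the text.

```python
def low_pass(samples, alpha):
    """First-order low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    This is an assumed, simplified stand-in for a low-pass channel.
    """
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


# Square pulse: 5 zero samples, 10 samples at level 1, then 15 zero samples.
pulse = [0.0] * 5 + [1.0] * 10 + [0.0] * 15
received = low_pass(pulse, alpha=0.3)  # alpha is an illustrative choice

# Print input, output, and a crude bar plot of the output.
for x, y in zip(pulse, received):
    print(f"{x:.0f}  {y:.3f}  " + "#" * int(round(20 * y)))
```

The printed output rises gradually instead of jumping to full amplitude, and it stays above zero after the input pulse has ended, which is the "spreading out" described above.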

In a practical environment, signals passing through communication channels not only 
experience channel distortions but also are corrupted along the path by undesirable
interferences and disturbances lumped under the broad term noise. These interfering signals are
random and unpredictable, and they come from sources both external and internal. External noise
includes interference signals transmitted on nearby channels; human-made noise generated by faulty
contact switches of electrical equipment, automobile ignition radiation, fluorescent lights,
microwave ovens, and cellphone emissions; and natural noise from lightning, electric
storms, and solar and intergalactic radiation. With proper care in system design, external
noise can be minimized or even eliminated in some cases. Internal noise results from thermal
motion of charged particles in conductors, random emission, and diffusion or recombination
of charged carriers in electronic devices. Proper care can reduce the effect of internal
noise but can never eliminate it. Noise is one of the underlying factors that limit the rate of
telecommunications.

Thus in practical communication systems, the channel distorts the signal, and noise accumulates
along the path. Worse yet, the signal strength decreases while the noise level remains
steady regardless of the distance from the transmitter. Thus, the signal quality is continuously
worsening along the length of the channel. Amplification of the received signal to make up for 
the attenuation is to no avail because the noise will be amplified by the same proportion, and 
the quality remains, at best, unchanged.* These are the key challenges that we must face in 
designing modern communication systems. 
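
The claim that amplification cannot improve signal quality can be checked with simple arithmetic. The following sketch assumes an amplifier that applies one power gain to both signal and noise and optionally adds its own noise; the function names and all power values are hypothetical and chosen only for illustration.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(signal_power / noise_power)


def amplify(signal_power, noise_power, power_gain, amplifier_noise_power=0.0):
    """Apply the same power gain to signal and noise, then add amplifier noise."""
    return (signal_power * power_gain,
            noise_power * power_gain + amplifier_noise_power)


# Hypothetical received powers in watts (illustrative values only).
s_in, n_in = 1e-9, 1e-11

# Ideal amplifier: contributes no noise of its own.
s_ideal, n_ideal = amplify(s_in, n_in, power_gain=1e6)

# Noisy amplifier: adds an assumed amount of internal noise at its output.
s_noisy, n_noisy = amplify(s_in, n_in, power_gain=1e6,
                           amplifier_noise_power=1e-5)

print(f"SNR before amplification:   {snr_db(s_in, n_in):.2f} dB")
print(f"SNR after ideal amplifier:  {snr_db(s_ideal, n_ideal):.2f} dB")
print(f"SNR after noisy amplifier:  {snr_db(s_noisy, n_noisy):.2f} dB")
```

With an ideal amplifier the ratio is unchanged because the gain cancels in the quotient; any amplifier noise can only lower it, in line with the footnote on amplifier noise.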

1.2 ANALOG AND DIGITAL MESSAGES 

Messages are digital or analog. Digital messages are ordered combinations of finite symbols or
codewords. For example, printed English consists of 26 letters, 10 numbers, a space, and several
punctuation marks. Thus, a text document written in English is a digital message constructed
from the ASCII keyboard of 128 symbols. Human speech is also a digital message, because it is
made up from a finite vocabulary in a language.† Music notes are also digital, even though the
music sound itself is analog. Similarly, a Morse-coded telegraph message is a digital message
constructed from a set of only two symbols: dash and dot. It is therefore a binary message,
implying only two symbols. A digital message constructed with M symbols is called an M-ary
message.

Analog messages, on the other hand, are characterized by data whose values vary over a 
continuous range and are defined for a continuous range of time. For example, the temperature
or the atmospheric pressure of a certain location over time can vary over a continuous range and
can assume an (uncountable) infinite number of possible values. A piece of music recorded by
a pianist is also an analog signal. Similarly, a particular speech waveform has amplitudes that
vary over a continuous range. Over a given time interval, an infinite number of possible different
speech waveforms exist, in contrast to only a finite number of possible digital messages. 

1.2.1 Noise Immunity of Digital Signals 

It is no secret to even a casual observer that every time one looks at the latest electronic 
communication products, newer and better “digital technology” is replacing the old analog
technology. Within the past decade, cellular phones have completed their transformation from
the first-generation analog AMPS to the current second-generation (e.g., GSM, CDMA) and
third-generation (e.g., WCDMA) digital offspring. More visibly in every household, digital
video technology (DVD) has made the analog VHS cassette systems almost obsolete. Digital
television continues the digital assault on analog video technology by driving out the last
analog holdout of color television. There is every reason to ask: Why are digital technologies
better? The answer has to do with both economics and quality. The case for economics is
made by noting the ease of adopting versatile, powerful, and inexpensive high-speed digital
microprocessors. But more importantly at the quality level, one prominent feature of digital
communications is the enhanced immunity of digital signals to noise and interferences. 

Digital messages are transmitted as a finite set of electrical waveforms. In other words, 
a digital message is generated from a finite alphabet, while each character in the alphabet 
can be represented by one waveform or a sequential combination of such waveforms. For 
example, in sending messages via Morse code, a dash can be transmitted by an electrical
pulse of amplitude A/2 and a dot can be transmitted by a pulse of negative amplitude
-A/2 (Fig. 1.3a). In an M-ary case, M distinct electrical pulses (or waveforms) are used;
each of the M pulses represents one of the M possible symbols. Once the pulses are transmitted, the
receiver must extract the message from a distorted and noisy signal at the channel output.
Message extraction is often easier from digital signals than from analog signals because
the digital decision must belong to the finite-sized alphabet. Consider a binary case: two
symbols are encoded as rectangular pulses of amplitudes A/2 and -A/2. The only decision
at the receiver is to select between two possible pulses received; the fine details of
the pulse shape are not an issue. A finite alphabet leads to noise and interference immunity.
The receiver's decision can be made with reasonable certainty even if the pulses
have suffered modest distortion and noise (Fig. 1.3). The digital message in Fig. 1.3a is
distorted by the channel, as shown in Fig. 1.3b. Yet, if the distortion is not too large, we can
recover the data without error because we need make only a simple binary decision: Is the
received pulse positive or negative? Figure 1.3c shows the same data with channel distortion
and noise. Here again, the data can be recovered correctly as long as the distortion and the
noise are within limits. In contrast, the waveform shape itself in an analog message carries the
needed information, and even a slight distortion or interference in the waveform will show up
in the received signal. Clearly, a digital communication system is more rugged than an analog
communication system in the sense that it can better withstand noise and distortion (as long
as they are within a limit).

Figure 1.3

* Actually, amplification may further deteriorate the signal because of additional amplifier noise.
† Here we imply the information contained in the speech rather than its details such as the pronunciation of words
and varying inflections, pitch, and emphasis. The speech signal from a microphone contains all these details and is
therefore an analog signal, and its information content is more than a thousand times greater than the information
accessible from the written text of the same speech.
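
The binary decision described above (is the received pulse positive or negative?) can be sketched in a few lines. This is a minimal simulation under assumptions not made in the text: one sample per pulse, additive zero-mean Gaussian noise, bit 1 mapped to +A/2 and bit 0 to -A/2, and hypothetical values for the amplitude and the noise level.

```python
import random


def transmit(bits, amplitude):
    """Map bit 1 to +A/2 and bit 0 to -A/2 (an assumed mapping)."""
    return [amplitude / 2 if b else -amplitude / 2 for b in bits]


def add_noise(samples, noise_rms, rng):
    """Add zero-mean Gaussian noise (an assumed noise model) to each sample."""
    return [s + rng.gauss(0.0, noise_rms) for s in samples]


def detect(samples):
    """Binary decision: positive sample means 1, otherwise 0."""
    return [1 if s > 0 else 0 for s in samples]


rng = random.Random(1)  # fixed seed so the run is repeatable
bits = [rng.randint(0, 1) for _ in range(10_000)]

# A = 2 gives pulses of +1 and -1; noise rms 0.2 is one fifth of the pulse amplitude.
received = add_noise(transmit(bits, amplitude=2.0), noise_rms=0.2, rng=rng)
errors = sum(b != d for b, d in zip(bits, detect(received)))

print(f"bit errors: {errors} out of {len(bits)}")
```

The detector never looks at the pulse shape, only at its sign, which is why moderate noise leaves the recovered bits essentially untouched.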


1.2.2 Viability of Distortionless Regenerative Repeaters 

One main reason for the superior quality of digital systems over analog ones is the viability 
of regenerative repeaters and network nodes in the former. Repeater stations are placed along 
the communication path of a digital system at distances short enough to ensure that noise 
and distortion remain within a limit. This allows pulse detection with high accuracy. At each 
repeater station, or network node, the incoming pulses are detected such that new, “clean” pulses 
are retransmitted to the next repeater station or node. This process prevents the accumulation 
of noise and distortion along the path by cleaning the pulses at regular repeater intervals. 
We can thus transmit messages over longer distances with greater accuracy. There has been 
widespread application of distortionless regeneration by repeaters in long-haul communication 
systems and by nodes in a large (possibly heterogeneous) network. 

For analog systems, signals and noise within the same bandwidth cannot be separated.
Repeaters in analog systems are basically filters plus amplifiers and are not “regenerative.”
Thus, it is impossible to avoid in-band accumulation of noise and distortion along the path. 
As a result, the distortion and the noise interference can accumulate over the entire transmission
path as a signal traverses the network. To compound the problem, the signal is
attenuated continuously over the transmission path. Thus, with increasing distance the signal
becomes weaker, whereas the distortion and the noise accumulate more. Ultimately, the signal,
overwhelmed by the distortion and noise, is buried beyond recognition. Amplification is of
little help, since it enhances both the signal and the noise equally. Consequently, the distance
over which an analog message can be successfully received is limited by the power of the first
transmitter. Despite these limitations, analog communication was used widely and successfully in
the past for short- to medium-range communications. Nowadays, because of the advent of
optical fiber communications and the dramatic cost reduction achieved in the fabrication of
high-speed digital circuitry and digital storage devices, almost all new communication systems
being installed are digital. But some old analog communication facilities are still in use,
including those for AM and FM radio broadcasting. 
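
The contrast between regenerative repeaters and amplifying repeaters can be made concrete with a toy simulation. The sketch below rests on simplifying assumptions that are not from the text: each hop adds independent zero-mean Gaussian noise, attenuation is taken to be exactly compensated at every repeater, pulses are +1 or -1, and the hop count and noise level are hypothetical.

```python
import random


def analog_chain(levels, hops, noise_rms, rng):
    """Amplifying repeaters: noise added on each hop is passed on and accumulates."""
    out = list(levels)
    for _ in range(hops):
        out = [x + rng.gauss(0.0, noise_rms) for x in out]
    return out


def regenerative_chain(levels, hops, noise_rms, rng):
    """Regenerative repeaters: detect the sign, then retransmit a clean pulse."""
    out = list(levels)
    for _ in range(hops):
        noisy = [x + rng.gauss(0.0, noise_rms) for x in out]
        out = [1.0 if x > 0 else -1.0 for x in noisy]
    return out


def count_errors(sent, received):
    """Count pulses whose sign at the destination differs from the sent sign."""
    return sum((s > 0) != (r > 0) for s, r in zip(sent, received))


rng = random.Random(7)  # fixed seed so the run is repeatable
sent = [rng.choice((-1.0, 1.0)) for _ in range(5_000)]
hops, noise_rms = 50, 0.2  # illustrative values

analog_errors = count_errors(sent, analog_chain(sent, hops, noise_rms, rng))
regen_errors = count_errors(sent, regenerative_chain(sent, hops, noise_rms, rng))

print(f"errors with amplifying repeaters:   {analog_errors}")
print(f"errors with regenerative repeaters: {regen_errors}")
```

In this model the accumulated noise after many hops is comparable to the pulse amplitude in the amplifying chain, while each regenerative hop only has to cope with the noise of a single hop.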


1.2.3 Analog-to-Digital (A/D) Conversion

Despite the differences between analog and digital signals, a meeting ground exists between 
them: conversion of analog signals to digital signals (A/D conversion). A key device in
electronics, the analog-to-digital (A/D) converter, enables digital communication systems to
convey analog source signals such as audio and video. Generally, analog signals are continuous
in time and in range; that is, they have values at every time instant, and their values can be
anything within the range. On the other hand, digital signals exist only at discrete points of time,
and they can take on only finite values. A/D conversion can never be 100% accurate. Since, 
however, human perception does not require infinite accuracy, A/D conversion can effectively 
capture necessary information from the analog source for digital signal transmission. 

Two steps take place in A/D conversion: a continuous time signal is first sampled into a 
discrete time signal, whose continuous amplitude is then quantized into a discrete level signal. 
First, the frequency spectrum of a signal indicates relative magnitudes of various frequency 
components. The sampling theorem (Chapter 6) states that if the highest frequency in the 
signal spectrum is B (in hertz), the signal can be reconstructed from its discrete samples, 
taken uniformly at a rate not less than 2B samples per second. This means that to preserve
the information from a continuous-time signal, we need transmit only its samples (Fig. 1.4). 


Figure 1.4

However, the sample values are still not digital because they lie in a continuous dynamic
range. Here, the second step of quantization comes to the rescue. Through quantization, each
sample is approximated, or “rounded off,” to the nearest quantized level, as shown in Fig. 1.4.
As human perception has only limited accuracy, quantization with sufficient granularity does
not compromise the signal quality. If amplitudes of the message signal m(t) lie in the range
(-m_p, m_p), the quantizer partitions the signal range into L intervals. Each sample amplitude
is approximated by the midpoint of the interval in which the sample value falls. Each sample
is now represented by one of the L numbers. The information is thus digitized. Hence,
after the two steps of sampling and quantizing, the analog-to-digital (A/D) conversion is
completed.
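
The midpoint rule just described can be sketched in a few lines. This is a minimal illustration that assumes L intervals of equal width (the text does not say whether the partition is uniform); the function name `quantize` and the sample value 0.3 are hypothetical.

```python
def quantize(sample, m_p, L):
    """Map a sample in (-m_p, m_p) to (interval index, interval midpoint).

    Assumes the range is split into L intervals of equal width.
    """
    step = 2 * m_p / L                       # width of each interval
    index = int((sample + m_p) // step)      # interval in which the sample falls
    index = min(max(index, 0), L - 1)        # clamp samples at or beyond the edges
    midpoint = -m_p + (index + 0.5) * step   # value that represents the interval
    return index, midpoint

# Example with assumed values: m_p = 1 and L = 16 levels.
index, value = quantize(0.3, m_p=1.0, L=16)
print(index, value)
```

Under this uniform assumption the rounding error is never larger than half an interval width, m_p/L, which is why increasing L improves accuracy.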

The quantized signal is an approximation of the original one. We can improve the accuracy
of the quantized signal to any desired level by increasing the number of levels L.
For intelligibility of voice signals, for example, L = 8 or 16 is sufficient. For commercial
use, L = 32 is a minimum, and for telephone communication, L = 128 or 256 is commonly
used.

A typical distorted binary signal with noise acquired over the channel is shown in Fig. 1.3. If
A is sufficiently large in comparison to typical noise amplitudes, the receiver can still correctly
distinguish between the two pulses. The pulse amplitude is typically 5 to 10 times the rms noise
amplitude. For such a high signal-to-noise ratio (SNR) the probability of error at the receiver
is less than 10^{-6}; that is, on the average, the receiver will make fewer than one error per
million pulses. The effect of random channel noise and distortion is thus practically eliminated.
Hence, when analog signals are transmitted by digital means, some error, or uncertainty, in the
received signal can be caused by quantization, in addition to channel noise and interferences.
By increasing L, we can reduce to any desired amount the uncertainty, or error, caused by
quantization. At the same time, because of the use of regenerative repeaters, we can transmit
signals over a much longer distance than would have been possible for the analog signal. As
will be seen later in this text, the price for all these benefits of digital communication is paid
in terms of increased processing complexity and bandwidth of transmission.
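
The order of magnitude of the error probability quoted above can be checked with a rough sketch. It rests on assumptions the text does not state here: the noise is taken to be zero-mean Gaussian, the receiver uses a simple threshold, and the hypothetical parameter k is the ratio of the distance between a pulse level and the decision threshold to the rms noise amplitude.

```python
import math

def q_function(x):
    """Tail probability of a standard Gaussian variable: P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# k = (distance from pulse level to decision threshold) / (rms noise amplitude).
# Under the Gaussian assumption, a pulse is misread when the noise exceeds
# that distance, which happens with probability Q(k).
for k in (3, 4, 5):
    print(f"k = {k}: P(error) = {q_function(k):.2e}")
```

With k = 5 the computed probability is below 10^{-6}, in line with the figure of fewer than one error per million pulses; smaller margins give noticeably higher error rates.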


1.2.4 Pulse-Coded Modulation—A Digital Representation 

Once the A/D conversion is over, the original analog message is represented by a sequence 
of samples, each of which takes on one of the L preset quantization levels. The transmission 
of this quantized sequence is the task of digital communication systems. For this reason, 
signal waveforms must be used to represent the quantized sample sequence in the transmission 
process. Similarly, a digital storage device also would need to represent the samples as signal 
waveforms. Pulse-coded modulation (PCM) is a very simple and yet common mechanism for 
this purpose. 

First, one information bit refers to one binary digit of 1 or 0. The idea of PCM is to represent
each quantized sample by an ordered combination of two basic pulses: p_1(t) representing 1 and
p_0(t) representing 0. Because each of the L possible sample values can be written as a bit string
of length log_2 L, each sample can therefore also be mapped into a short pulse sequence that
represents the binary sequence of bits. For example, if L = 16, then each quantized level can
be described uniquely by 4 bits. If we use two basic pulses, p_1(t) = A/2 and p_0(t) = -A/2, a
sequence of four such pulses gives 2 × 2 × 2 × 2 = 16 distinct patterns, as shown in Fig. 1.5.
We can assign one pattern to each of the 16 quantized values to be transmitted. Each quantized
sample is now coded into a sequence of four binary pulses. This is the principle of PCM
transmission, where signaling is carried out by means of only two basic pulses (or symbols).

Figure 1.5

The binary case is of great practical importance because of its simplicity and ease of detection.
Much of today's digital communication is binary.*
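
The mapping from a quantized level to a pulse sequence can be sketched as follows. The assignment of patterns to levels is an assumption here (natural binary order is used; the figure may use a different assignment), and the names `pcm_encode` and `pcm_decode` are hypothetical.

```python
def pcm_encode(level, L=16, A=1.0):
    """Encode a level index (0 .. L-1) as pulse amplitudes +A/2 (bit 1) or -A/2 (bit 0).

    Assumes L is a power of two and that levels are numbered in natural binary order.
    """
    n_bits = (L - 1).bit_length()            # equals log2(L) when L is a power of two
    bits = format(level, f"0{n_bits}b")      # e.g. level 5 -> "0101"
    return [A / 2 if b == "1" else -A / 2 for b in bits]

def pcm_decode(pulses):
    """Recover the level index from the signs of the received pulses."""
    bits = "".join("1" if p > 0 else "0" for p in pulses)
    return int(bits, 2)

pulses = pcm_encode(5)
print(pulses, pcm_decode(pulses))
```

Decoding by sign alone reflects the ease of detection mentioned above: the receiver only has to decide between two possibilities for each pulse.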

Although PCM was invented by P. M. Rainey in 1926 and rediscovered by A. H. Reeves in
1939, it was not until the early 1960s that the Bell System installed the first communication link 
using PCM for digital voice transmission. The cost and size of vacuum tube circuits were the 
chief impediments to the use of PCM in the early days before the discovery of semiconductor 
devices. It was the transistor that made PCM practicable. 

From all these discussions on PCM, we arrive at a rather interesting (and to a certain extent
not obvious) conclusion: every possible communication can be carried on with a minimum
of two symbols. Thus, merely by using a proper sequence of winks of the eye, one can
convey any message, be it a conversation, a book, a movie, or an opera. Every possible detail
(such as various shades of colors of the objects and tones of the voice, etc.) that is reproducible
on a movie screen or on the high-definition color television can be conveyed with no less
accuracy, merely by winks of an eye.*


* An intermediate case exists where we use four basic pulses (quaternary pulses) of amplitudes ±A/2 and ±3A/2. A
sequence of two quaternary pulses can form 4 × 4 = 16 distinct levels of values.

* Of course, to convey the information in a movie or a television program in real time, the winking would have to be 
at an inhumanly high speed. For example, the HDTV signal is represented by 19 million bits (winks) per second. 

1.3 CHANNEL EFFECT, SIGNAL-TO-NOISE RATIO,
AND CAPACITY


In designing communication systems, it is important to understand and analyze key factors
such as the channel and signal characteristics, the relative noise strength, the maximum
number of bits that can be sent over a channel per second, and, ultimately, the signal
quality.


1.3.1 Signal Bandwidth and Power 

In a given (digital) communication system, the fundamental parameters and physical limitations
that control the rate and quality are the channel bandwidth B and the signal power P_s.
Their precise and quantitative relationships will be discussed in later chapters. Here we shall
demonstrate these relationships qualitatively.

The bandwidth of a channel is the range of frequencies that it can transmit with reasonable 
fidelity. For example, if a channel can transmit with reasonable fidelity a signal whose frequency 
components vary from 0 Hz (dc) up to a maximum of 5000 Hz (5 kHz), the channel bandwidth 
B is 5 kHz. Likewise, each signal also has a bandwidth that measures the maximum range of 
its frequency components. 

The faster a signal changes, the higher its maximum frequency is, and the larger its
bandwidth is. Signals rich in content that changes quickly (such as those for battle scenes in
a video) have larger bandwidth than signals that are dull and vary slowly (such as those for a
daytime soap opera or a video of sleeping animals). A signal can be successfully sent over a
channel if the channel bandwidth exceeds the signal bandwidth.

To understand the role of B, consider the possibility of increasing the speed of information
transmission by compressing the signal in time. Compressing a signal in time by a factor 
of 2 allows it to be transmitted in half the time, and the transmission speed (rate) doubles. 
Time compression by a factor of 2, however, causes the signal to “wiggle” twice as fast, 
implying that the frequencies of its components are doubled. Many people have had firsthand 
experience of this effect when playing a piece of audiotape twice as fast, making the voices of 
normal people sound like the high-pitched speech of cartoon characters. Now, to transmit this 
compressed signal without distortion, the channel bandwidth must also be doubled. Thus, the 
rate of information transmission that a channel can successfully carry is directly proportional 
to B. More generally, if a channel of bandwidth B can transmit N pulses per second, then
to transmit KN pulses per second by means of the same technology, we need a channel of 
bandwidth KB. To reiterate, the number of pulses per second that can be transmitted over a 
channel is directly proportional to its bandwidth B. 

The signal power P_s plays a dual role in information transmission. First, P_s is related to
the quality of transmission. Increasing P_s strengthens the signal pulse and diminishes the effect
of channel noise and interference. In fact, the quality of either analog or digital communication
systems varies with the signal-to-noise ratio (SNR). In any event, a certain minimum SNR at
the receiver is necessary for successful communication. Thus, a larger signal power P_s allows
the system to maintain a minimum SNR over a longer distance, thereby enabling successful
communication over a longer span.

The second role of the signal power is less obvious, although equally important. From
the information theory point of view, the channel bandwidth B and the signal power P_s are,
to some extent, exchangeable; that is, to maintain a given rate and accuracy of information
transmission, we can trade P_s for B, and vice versa. Thus, one may use less B if one is willing
to increase P_s, or one may reduce P_s if one is given bigger B. The rigorous proof of this will
be provided in Chapter 13.

In short, the two primary communication resources are the bandwidth and the transmitted 
power. In a given communication channel, one resource may be more valuable than the other, 
and the communication scheme should be designed accordingly. A typical telephone channel, 
for example, has a limited bandwidth (3 kHz), but the power is less restrictive. On the other 
hand, in space vehicles, huge bandwidth is available but the power is severely limited. Hence, 
the communication solutions in the two cases are radically different. 


1.3.2 Channel Capacity and Data Rate 

Channel bandwidth limits the bandwidth of signals that can successfully pass through, whereas 
signal SNR at the receiver determines the recoverability of the transmitted signals. Higher SNR 
means that the transmitted signal pulse can use more signal levels, thereby carrying more bits 
with each pulse transmission. Higher bandwidth B also means that one can transmit more 
pulses (faster variation) over the channel. Hence, SNR and bandwidth B can both affect the 
underlying channel “throughput.” The peak throughput that can be reliably carried by a channel 
is defined as the channel capacity. 

One of the most commonly encountered channels is known as the additive white Gaussian 
noise (AWGN) channel. The AWGN channel model assumes no channel distortions except 
for the additive white Gaussian noise and its finite bandwidth B. This ideal model captures
application cases with distortionless channels and provides a performance upper bound for 
more general distortive channels. The band-limited AWGN channel capacity was dramatically 
highlighted by Shannon’s equation, 

$$C = B \log_2(1 + \mathrm{SNR}) \quad \text{bit/s} \tag{1.1}$$

Here the channel capacity C is the upper bound on the rate of information transmission per 
second. In other words, C is the maximum number of bits that can be transmitted per second 
with a probability of error arbitrarily close to zero; that is, the transmission is as accurate as one 
desires. The capacity only points out this possibility, however; it does not specify how it is to be 
realized. Moreover, it is impossible to transmit at a rate higher than this without incurring errors. 
Shannon’s equation clearly brings out the limitation on the rate of communication imposed by B
and SNR. If there is no noise on the channel (assuming SNR = ∞), then the capacity C would
be ∞, and the communication rate could be arbitrarily high. We could then transmit any amount of
information in the world over one noiseless channel. This can be readily verified. If noise were 
zero, there would be no uncertainty in the received pulse amplitude, and the receiver would 
be able to detect any pulse amplitude without error. The minimum pulse amplitude separation 
can be arbitrarily small, and for any given pulse, we have an infinite number of fine levels 
available. We can assign one level to every possible message. Because an infinite number of 
levels are available, it is possible to assign one level to any conceivable message. Cataloging 
such a code may not be practical, but that is beside the point. Rather, the point is that if the 
noise is zero, communication ceases to be a problem, at least theoretically. Implementation 
of such a scheme would be difficult because of the requirement of generation and detection 
of pulses of precise amplitudes. Such practical difficulties would then set a limit on the rate of 
communication. It should be remembered that Shannon’s result, which represents the upper 
limit on the rate of communication over a channel, would be achievable only with a system of 
monstrous and impractical complexity, and with a time delay in reception approaching infinity. 
Practical systems operate at rates below the Shannon rate. 
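
As a quick numerical illustration of Eq. (1.1), the sketch below evaluates the capacity for one set of values. The 3 kHz bandwidth echoes the telephone channel mentioned earlier; the 30 dB SNR is purely an assumed figure, and the function names are hypothetical.

```python
import math

def db_to_linear(snr_db):
    """Convert an SNR given in decibels to a plain power ratio."""
    return 10 ** (snr_db / 10)

def awgn_capacity(bandwidth_hz, snr_linear):
    """Shannon capacity of a band-limited AWGN channel in bit/s, per Eq. (1.1)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Assumed example: B = 3 kHz and SNR = 30 dB (a power ratio of 1000).
capacity = awgn_capacity(3000, db_to_linear(30))
print(f"C = {capacity:.0f} bit/s")
```

Under these assumed values the capacity comes to roughly 29.9 kbit/s. As the text stresses, this is an upper bound and says nothing about how to reach it.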

In conclusion, Shannon's capacity equation demonstrates qualitatively the basic role
played by B and SNR in limiting the performance of a communication system. These two
parameters then represent the ultimate limitation on the rate of communication. The possibility
of resource exchange between these two basic parameters is also demonstrated by the
Shannon equation.

As a practical example of trading SNR for bandwidth B, consider the scenario in which we
meet a soft-spoken man who speaks a little bit too fast for us to fully understand. This means
that as listeners, our bandwidth B is too low and therefore, the capacity C is not high enough to
accommodate the rapidly spoken sentences. However, if the man can speak louder (increasing
power and hence the SNR), we are likely to understand him much better without changing
anything else. This example illustrates the concept of resource exchange between SNR and
B. Note, however, that this is not a one-to-one trade. Doubling the speaker volume allows the
speaker to talk a little faster, but not twice as fast. This unequal trade effect is fully captured by
Shannon's equation [Eq. (1.1)], where doubling the SNR cannot always compensate for a 50%
loss of B.
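
To make the unequal trade concrete, the short derivation below simply applies Eq. (1.1) and asks what SNR is needed to keep C unchanged when B is halved. The primed symbol SNR' is notation introduced here for the new SNR, not taken from the text.

```latex
\begin{align*}
B \log_2(1 + \mathrm{SNR}) &= \frac{B}{2} \log_2(1 + \mathrm{SNR}') \\
1 + \mathrm{SNR}' &= (1 + \mathrm{SNR})^2 \\
\mathrm{SNR}' &= \mathrm{SNR}^2 + 2\,\mathrm{SNR}
\end{align*}
```

For instance, with SNR = 10, halving B requires SNR' = 120, a twelvefold rather than a twofold increase.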


1.4 MODULATION AND DETECTION 

Analog signals generated by the message sources or digital signals generated through A/D
conversion of analog signals are often referred to as baseband signals because they typically
are low pass in nature. Baseband signals may be directly transmitted over a suitable channel
(e.g., telephone, fax). However, depending on the channel and signal frequency domain
characteristics, baseband signals produced by various information sources are not always suitable
for direct transmission over a given channel. When signal and channel frequency bands do
not match exactly, channels cannot be moved. Hence, messages must be moved to the right
channel frequency band. Message signals must therefore be further modified to facilitate
transmission. In this conversion process, known as modulation, the baseband signal is used
to modify (i.e., modulate) some parameter of a radio-frequency (RF) carrier signal.

A carrier is a sinusoid of high frequency. Through modulation, one of the carrier sinusoidal
parameters (amplitude, frequency, or phase) is varied in proportion to the baseband
signal m(t). Accordingly, we have amplitude modulation (AM), frequency modulation (FM),
or phase modulation (PM). Figure 1.6 shows a baseband signal m(t) and the corresponding
AM and FM waveforms. In AM, the carrier amplitude varies in proportion to m(t), and in
FM, the carrier frequency varies in proportion to m(t). To reconstruct the baseband signal at the
receiver, the modulated signal must pass through a reversal process called demodulation.
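
The two modulation rules can be sketched numerically. Everything specific in this example is an assumption chosen for illustration: the 1 Hz cosine message, the 20 Hz carrier, the frequency sensitivity `kf`, the sampling rate, and the AM form (A + m(t)) cos(2π f_c t), which is one common way of letting the carrier amplitude follow m(t).

```python
import math

def am_sample(t, m, fc, A=1.0):
    """AM: the carrier amplitude A + m(t) varies in proportion to m(t)."""
    return (A + m(t)) * math.cos(2 * math.pi * fc * t)

def fm_samples(m, fc, kf, fs, count, A=1.0):
    """FM: the instantaneous frequency is fc + kf*m(t); the phase is its running sum."""
    samples, phase = [], 0.0
    for n in range(count):
        samples.append(A * math.cos(phase))
        # Advance the phase by 2*pi*(instantaneous frequency)/fs for one sample period.
        phase += 2 * math.pi * (fc + kf * m(n / fs)) / fs
    return samples

def message(t):
    """Hypothetical baseband signal m(t): a slow 1 Hz cosine."""
    return 0.5 * math.cos(2 * math.pi * 1.0 * t)

am = [am_sample(n / 1000.0, message, fc=20.0) for n in range(1000)]
fm = fm_samples(message, fc=20.0, kf=5.0, fs=1000.0, count=1000)
print(max(am), max(fm))
```

In the AM samples the envelope follows the message, while the FM samples keep a constant envelope and carry the message in how quickly the carrier oscillates.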

As mentioned earlier, modulation is used to facilitate transmission. Some of the important 
reasons for modulation are given next. 

1.4.1 Ease of Radiation/Transmission 

For efficient radiation of electromagnetic energy, the radiating antenna should be on the order
of a fraction or more of the wavelength of the driving signal. For many baseband signals, the
wavelengths are too large for reasonable antenna dimensions. For example, the power in a
speech signal is concentrated at frequencies in the range of 100 to 3000 Hz. The corresponding
wavelength is 100 to 3000 km. This long wavelength would necessitate an impractically large
antenna. Instead, by modulating a high-frequency carrier, we effectively translate the signal
spectrum to the neighborhood of the carrier frequency that corresponds to a much smaller
wavelength. For example, a 10 MHz carrier has a wavelength of only 30 m, and its transmission
can be achieved with an antenna size on the order of 3 m. In this respect, modulation is like
letting the baseband signal hitch a ride on a high-frequency sinusoid (carrier). The carrier and
the baseband signal may also be compared to a stone and a piece of paper. If we wish to throw
a piece of paper, it cannot go too far by itself. But if it is wrapped around a stone (a carrier), it
can be thrown over a longer distance.

Figure 1.6
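
The wavelength figures quoted in this subsection follow from the relation wavelength = c/f. The sketch below reproduces them, assuming free-space propagation and the rounded value c = 3 × 10^8 m/s; the function name is hypothetical.

```python
SPEED_OF_LIGHT = 3.0e8  # m/s, rounded free-space value (an assumption of this sketch)

def wavelength_m(freq_hz):
    """Free-space wavelength in meters for a sinusoid of the given frequency."""
    return SPEED_OF_LIGHT / freq_hz

for f in (100.0, 3000.0, 10e6):
    print(f"{f:>12.0f} Hz -> {wavelength_m(f):.0f} m")
```

The speech-band frequencies map to wavelengths of hundreds to thousands of kilometers, whereas the 10 MHz carrier maps to 30 m, which is what makes a practical antenna possible.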


1.4.2 Simultaneous Transmission of Multiple 
Signals—Multiplexing 

Modulation also allows multiple signals to be transmitted at the same time in the same
geographical area without direct mutual interference. A case in point is the output of
multiple television stations carried by the same cable (or over
the air) to people's television receivers. Without modulation, multiple video signals will all
be interfering with one another because all baseband video signals effectively have the same
bandwidth. Thus, cable TV or broadcast TV without modulation would be limited to one station
at a time in a given location, a highly wasteful protocol because the channel bandwidth
is many times larger than that of the signal.

One way to solve this problem is to use modulation. Each TV station can modulate
a different carrier frequency, thus translating each signal to a different frequency
range. If the various carriers are chosen sufficiently far apart in frequency, the spectra of the
modulated signals (known as TV channels) will not overlap and thus will not interfere with 
each other. At the receiver (TV set), a tunable bandpass filter can select the desired station 
or TV channel for viewing. This method of transmitting several signals simultaneously, over 
nonoverlapping frequency bands, is known as frequency division multiplexing (FDM). A 
similar approach is also used in AM and FM radio broadcasting. Here the bandwidth of the 
channel is shared by various signals without any overlapping. 

Another method of multiplexing several signals is known as time division multiplexing 
(TDM). This method is suitable when a signal is in the form of a pulse train (as in PCM). 
When the pulses are made narrower, the spaces left between pulses of one user signal are used 
for pulses from other signals. Thus, in effect, the transmission time is shared by a number of 
signals by interleaving the pulse trains of various signals in a specified order. At the receiver, 
the pulse trains corresponding to various signals are separated. 
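
The interleaving idea behind TDM can be sketched with lists standing in for pulse trains. This assumes a fixed round-robin order and equal-length streams; the function names and the example bit patterns are hypothetical.

```python
def tdm_multiplex(streams):
    """Interleave equal-length pulse streams in a fixed round-robin order."""
    return [pulse for frame in zip(*streams) for pulse in frame]

def tdm_demultiplex(line, n_streams):
    """Recover each stream by taking every n-th pulse, starting at its own slot."""
    return [line[k::n_streams] for k in range(n_streams)]

# Three hypothetical user signals, each already in pulse (bit) form.
users = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
line = tdm_multiplex(users)
print(line)
print(tdm_demultiplex(line, 3))
```

Separation at the receiver depends only on knowing the agreed slot order, which mirrors the "specified order" mentioned in the text.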

1.4.3 Demodulation 

Once multiple modulated signals have arrived at the receiver, the desired signal must be 
detected and recovered into its original baseband form. Note that because of FDM, the first 
stage of a demodulator typically requires a tunable bandpass filter so that the receiver can select 
the modulated signal at a predetermined frequency band specified by the transmission station 
or channel. Once a particular modulated signal has been isolated, the demodulator will then 
need to convert the carrier variation of amplitude, frequency, or phase, back into the baseband 
signal voltage. 

For the three basic modulation schemes of AM, FM, and PM, the corresponding demodulators
must be designed such that the detector output voltage varies in proportion to the input
modulated signal's amplitude, frequency, and phase, respectively. Once circuits with such
response characteristics have been implemented, the demodulators can downconvert the
modulated (RF) signals back into the baseband signals that represent the original source message,
be it audio, video, or data.


1.5 DIGITAL SOURCE CODING AND ERROR 
CORRECTION CODING 

As stated earlier, SNR and bandwidth are two factors that determine the performance of a given
communication system. Unlike analog communication systems, digital systems often adopt aggressive
measures to lower the source data rate and to fight against channel noise. In particular, source
coding is applied to generate the fewest bits possible for a given message without sacrificing its
detection accuracy. On the other hand, to combat errors that arise from noise and interferences,
redundancy needs to be introduced systematically at the transmitter, such that the receivers can
rely on the redundancy to correct errors caused by channel distortion and noise. This process
is known as error correction coding by the transmitter and decoding by the receiver.

Source coding and error correction coding are two successive stages in a digital communication
system that work in a see-saw battle. On one hand, the job of source coding is
to remove as much redundancy from the message as possible to shorten the digital message
sequence that requires transmission. Source coding aims to use as little bandwidth as possible
without considering channel noise and interference. On the other hand, error correction coding
deliberately introduces well-designed redundancy, such that if errors occur upon detection, the
redundancy can help correct the most likely errors.

Randomness, Redundancy, and Source Coding

To understand source coding, it is important to first discuss the role of randomness in
communications. As noted earlier, channel noise is a major factor limiting communication performance
because it is random and cannot be removed by prediction. On the other hand, randomness is
also closely associated with the desired signals in communications. Indeed, randomness is the
essence of communication. Randomness means unpredictability, or uncertainty, of a source
message. If a source had no unpredictability, like a friend who always wants to repeat the same
story on “how I was abducted by an alien,” then the message would be known beforehand
and would contain no information. Similarly, if a person winks, it conveys some information
in a given context. But if a person winks continuously with the regularity of a clock, the winks
convey no information. In short, a predictable signal is not random and is fully redundant.
Thus, a message contains information only if it is unpredictable. Higher predictability means
higher redundancy and, consequently, less information. Conversely, more unpredictable or less
likely random signals contain more information.

Source coding reduces redundancy based on the predictability of the message source. The
objective of source coding is to use codes that are as short as possible to represent the source
signal. Shorter codes are more efficient because they require less time to transmit at a given
data rate. Hence, source coding should remove signal redundancy while encoding and
transmitting the unpredictable, random part of the signal. The more predictable messages contain
more redundancy and require shorter codes, while messages that are less likely contain more
information and should be encoded with longer codes. By assigning more likely messages with
shorter source codes and less likely messages with longer source codes, one obtains more
efficient source coding. Consider the Morse code, for example. In this code, various combinations
of dashes and dots (code words) are assigned to each letter. To minimize transmission time,
shorter code words are assigned to more frequently occurring (more probable) letters (such
as e, t, and a) and longer code words are assigned to rarely occurring (less probable) letters
(such as x, q, and z). Thus, on average, messages in English would tend to follow a known
letter distribution, thereby leading to shorter code sequences that can be quickly transmitted.
This explains why Morse code is a good source code.

It will be shown in Chapter 13 that for digital signals, the overall transmission time is
minimized if a message (or symbol) of probability P is assigned a code word with a length
proportional to log(1/P). Hence, from an engineering point of view, the information of a
message with probability P is proportional to log(1/P). This is known as entropy (source)
coding.
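
The log(1/P) rule can be illustrated with a small sketch. It assumes base-2 logarithms, so that lengths come out in bits, and a made-up four-message source whose probabilities are chosen so that the ideal lengths are whole numbers; the names used are hypothetical.

```python
import math

def information_bits(p):
    """Information of a message of probability p, taken as log2(1/p) bits."""
    return math.log2(1 / p)

def entropy_bits(probabilities):
    """Average information per message over the whole source."""
    return sum(p * information_bits(p) for p in probabilities if p > 0)

# Hypothetical source: probabilities chosen purely for illustration.
source = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Ideal code-word length for each message under the log(1/P) rule.
code_lengths = {msg: information_bits(p) for msg, p in source.items()}
average_length = sum(p * code_lengths[msg] for msg, p in source.items())

print(code_lengths)
print(average_length, entropy_bits(source.values()))
```

For this source the average length is 1.75 bits per message, compared with 2 bits for a fixed-length code over four messages, which shows the saving from giving likely messages shorter code words.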

Error Correction Coding 

Error correction coding also plays an important role in communication. While source coding
removes redundancy, error correction codes add redundancy. The systematic introduction of
redundancy supports reliable communication.⁴ Because of redundancy, if certain bits are in
error due to noise or interference, other related bits may help them recover, allowing us to
decode a message accurately despite errors in the received signal. All languages are redundant.
For example, English is about 50% redundant; that is, on the average, we may throw out half
the letters or words without losing the meaning of a given message. This also means that in
any English message, the speaker or the writer has free choice over half the letters or words,
on the average. The remaining half is determined by the statistical structure of the language.
If all the redundancy of English were removed, it would take about half the time to transmit
a telegram or telephone conversation. If an error occurred at the receiver, however, it would
be rather difficult to make sense out of the received message. The redundancy in a message,
therefore, plays a useful role in combating channel noises and interferences.

It may appear paradoxical that in source coding we would remove redundancy, only to add
more redundancy at the subsequent error correction coding. To explain why this is sensible,
consider the removal of all redundancy in English through source coding. This would shorten
the message by 50% (for bandwidth saving). However, for error correction, we may restore
some systematic redundancy, except that this well-designed redundancy is only half as long as
what was removed by source coding while still providing the same amount of error protection.
It is therefore clear that a good combination of source coding and error correction coding
can remove inefficient redundancy without sacrificing error correction. In fact, a very popular
problem in this field is the persistent pursuit of joint source-channel coding that can maximally
remove signal redundancy without losing error correction.

How redundancy can enable error correction can be seen with an example: to transmit
samples with L = 16 quantizing levels, we may use a group of four binary pulses, as shown
in Fig. 1.5. In this coding scheme, no redundancy exists. If an error occurs in the reception of
even one of the pulses, the receiver will produce a wrong value. Here we may use redundancy
to eliminate the effect of possible errors caused by channel noise or imperfections. Thus, if we
add to each code word one more pulse of such polarity as to make the number of positive pulses
even, we have a code that can detect a single error in any place. Thus, to the code word 0001
we add a fifth pulse, of positive polarity, to make a new code word, 00011. Now the number of
positive pulses is 2 (even). If a single error occurs in any position, this parity will be violated.
The receiver knows that an error has been made and can request retransmission of the message.
This is a very simple coding scheme. It can only detect an error; it cannot locate or correct
it. Moreover, it cannot detect an even number of errors. By introducing more redundancy, it
is possible not only to detect but also to correct errors. For example, for L = 16, it can be
shown that properly adding three pulses will not only detect but also correct a single error
occurring at any location. Details on the subject of error correcting codes will be discussed in
Chapter 14.
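
Both ideas can be sketched briefly. The first pair of functions implements the even-parity pulse described above. For the three-pulse case the text does not name a particular code, so the sketch uses the well-known Hamming (7,4) code as one example of a code with that property; this choice, the bit ordering, and all function names are assumptions of the sketch.

```python
def add_even_parity(bits):
    """Append one bit so that the total number of 1s (positive pulses) is even."""
    return bits + [sum(bits) % 2]

def parity_ok(word):
    """True when the received word still contains an even number of 1s."""
    return sum(word) % 2 == 0

def hamming74_encode(data):
    """Encode 4 data bits with 3 check bits (Hamming (7,4), positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(word):
    """Correct at most one flipped bit, then return the 4 data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s3   # 0 means no error; otherwise the 1-based error position
    if position:
        w[position - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]

word = add_even_parity([0, 0, 0, 1])
print(word, parity_ok(word))

sent = hamming74_encode([0, 0, 0, 1])
received = list(sent)
received[2] ^= 1                      # simulate a single channel error
print(hamming74_decode(received))
```

The parity check flags any single error but cannot say where it is, and two errors cancel out unnoticed. The three check bits in the second scheme overlap in such a way that their pattern of failures points to the position of a single error.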


1.6 A BRIEF HISTORICAL REVIEW OF MODERN 
TELECOMMUNICATIONS 

Telecommunications (literally: communications at a distance) have always been critical to human
society. Even in ancient times, governments and military units relied heavily on
telecommunications to gather information and to issue orders. The first type was with messengers on foot
or on horseback; but the need to convey a short message over a large distance (such as one
warning a city of approaching raiders) led to the use of fire and smoke signals. Using signal
mirrors to reflect sunlight (heliography) was another effective way of telecommunication. Its
first recorded use was in ancient Greece. Signal mirrors were also mentioned in Marco Polo's
account of his trip to the Far East.¹ These ancient visual communication technologies are,
amazingly enough, digital. Fires and smoke in different configurations would form different
codewords. On hills or mountains near Greek cities there were also special personnel for such
communications, forming a chain of regenerative repeaters. In fact, fire and smoke signal
platforms still dot the Great Wall of China. More interestingly, reflectors or lenses, equivalent
to the amplifiers and antennas we use today, were used to directionally guide the light farther.

Naturally, these early visual communication systems were very tedious to set up and could 
transmit only several bits of information per hour. A much faster visual communication system 
was developed just over two centuries ago. In 1793 Claude Chappe of France invented and 
performed a series of experiments on the concept of “semaphore telegraph.” His system was a 
series of signaling devices called semaphores, which were mounted on towers, typically spaced 
10 km apart. (A semaphore looked like a large human figure with signal flags in both hands.) A
receiving semaphore operator would transcribe visually, often with the aid of a telescope, and 
then relay the message from his tower to the next, and so on. This visual telegraph became the 
government telecommunication system in France and spread to other countries, including the 
United States. The semaphore telegraph was eventually eclipsed by electric telegraphy. Today, 
only a few remaining streets and landmarks with the name “Telegraph Hill” remind us of the
place of this system in history. Still, visual communications (via Aldis lamps, ship flags, and 
heliographs) remained an important part of maritime communications well into the twentieth 
century. 

These early telecommunication systems are optical systems based on visual receivers. 
Thus, they can cover only line-of-sight distance, and human operators are required to decode 
the signals. An important event that changed the history of telecommunication occurred in 
1820, when Hans Christian Oersted of Denmark discovered the interaction between electricity 
and magnetism [2]. Michael Faraday made the next crucial discovery, which changed the
history of both electricity and telecommunications, when he found that electric current 
can be induced on a conductor by a changing magnetic field. Thus, electricity generation 
became possible by magnetic field motion. Moreover, the transmission of electric signals 
became possible by varying an electromagnetic field to induce current change in a distant 
circuit. The amazing aspect of Faraday's discovery on current induction is that it provides 
the foundation for wireless telecommunication over distances without line-of-sight, and more 
importantly, it shows how to generate electricity as an energy source to power such systems. 
The invention of the electric telegraph soon followed, and the world entered the modern electric 
telecommunication era. 

Modern communication systems have come a long way from their infancy. Since it
would be difficult to detail all the historical events that mark the recent development of 
telecommunication, we shall instead use Table 1.1 to chronicle some of the most notable 
events in the development of modern communication systems. Since our focus is on electrical 
telecommunication, we shall refrain from reviewing the equally long history of optical (fiber)
communications. 

It is remarkable that all the early telecommunication systems are symbol-based digital 
systems. It was not until Alexander Graham Bell's invention of the telephone system that
analog live signals were transmitted. Live signals can be instantly heard or seen by the receiving 
users. The Bell invention that marks the beginning of a new (analog communication) era is 
therefore a major milestone in the history of telecommunications. Figure 1.7 shows a copy of
an illustration from Bell's groundbreaking 1876 telephone patent. Scientific historians often 
hail this invention as the most valuable patent ever issued in history. 

The invention of telephone systems also marks the beginning of the analog communication
era and live signal transmission. On an exciting but separate path, wireless
communication began in 1887, when Heinrich Hertz first demonstrated a way to detect
the presence of electromagnetic waves. French scientist Edouard Branly, English physicist
Oliver Lodge, and Russian inventor Alexander Popov all made important contributions
to the development of radio receivers. Another important contributor to this area was
the Croatian-born genius Nikola Tesla. Building upon earlier experiments and inventions,
Italian scientist and inventor Guglielmo Marconi developed a wireless telegraphy system
in 1895, for which he shared the Nobel Prize in Physics in 1909. Marconi's wireless
telegraphy marked a historical event of commercial wireless communications. Soon, the marriage
of the inventions of Bell and Marconi allowed analog audio signals to go wireless,
thanks to amplitude modulation (AM) technology. Quality music transmission via FM radio
broadcast was first demonstrated by American inventor Major Edwin H. Armstrong. Armstrong's
FM demonstration in 1935 took place at an IEEE meeting in New York's Empire
State Building.

TABLE 1.1
Important Events of the Past Two Centuries of Telecommunications

Year      Major Events
1820      First experiment of electric current causing magnetism (by Hans C. Oersted)
1831      Discovery of induced current from electromagnetic radiation (by Michael Faraday)
1830-32   Birth of telegraph (credited to Joseph Henry and Pavel Schilling)
1837      Invention of Morse code by Samuel F. B. Morse
1864      Theory of electromagnetic waves developed by James C. Maxwell
1866      First transatlantic telegraph cable in operation
1876      Invention of telephone by Alexander G. Bell
1878      First telephone exchange in New Haven, Connecticut
1887      Detection of electromagnetic waves by Heinrich Hertz
1896      Wireless telegraphy (radio telegraphy) patented by Guglielmo Marconi
1901      First transatlantic radio telegraph transmission by Marconi
1906      First amplitude modulation radio broadcasting (by Reginald A. Fessenden)
1907      Regular transatlantic radio telegraph service
1915      First transcontinental telephone service
1920      First commercial AM radio stations
1921      Mobile radio adopted by Detroit Police Department
1925      First television system demonstration (by Charles F. Jenkins)
1928      First television station W3XK in the United States
1935      First FM radio demonstration (by Edwin H. Armstrong)
1941      NTSC black and white television standard
          First commercial FM radio service
1947      Cellular concept first proposed at Bell Labs
1948      First major information theory paper published by Claude E. Shannon
          Invention of transistor by William Shockley, Walter Brattain, and John Bardeen
1949      The construction of Golay code for 3 (or fewer) bit error correction
1950      Hamming codes constructed for simple error corrections
1953      NTSC color television standard
1958      Integrated circuit proposed by Jack Kilby (Texas Instruments)
1960      Construction of the powerful Reed-Solomon error correcting codes
1962      First computer telephone modem developed: Bell Dataphone 103A (300 bit/s)
1962      Low-density parity check error correcting codes proposed by Robert G. Gallager
1968-69   First error correction encoders on board NASA space missions (Pioneer IX and Mariner VI)
1971      First wireless computer network: AlohaNet
1973      First portable cellular telephone demonstration to the U.S. Federal Communications
          Commission, by Motorola
1978      First mobile cellular trial by AT&T
1984      First handheld (analog) AMPS cellular phone service by Motorola
1989      Development of DSL modems for high-speed computer connections
1991      First (digital) GSM cellular service launched (Finland)
          First wireless local area network (LAN) developed (AT&T-NCR)
1993      Digital ATSC standard established
1993      Turbo codes proposed by Berrou, Glavieux, and Thitimajshima
1996      First commercial CDMA (IS-95) cellular service launched
          First HDTV broadcasting
1997      IEEE 802.11(b) wireless LAN standard
1998      Large-scope commercial ADSL deployment
1999      IEEE 802.11a wireless LAN standard
2000      First 3G cellular service launched
2003      IEEE 802.11g wireless LAN standard

A historic year for both communications and electronics was 1948, the year that witnessed
the rebirth of digital communications and the invention of semiconductor transistors.
The rebirth of digital communications is owing to the originality and brilliance of Claude
E. Shannon, widely known as the father of modern digital communication and information
theory. In two seminal articles published in 1948, he first established the fundamental concept
of channel capacity and its relation to information transmission rate. Deriving the channel
capacity of several important models, Shannon [3] proved that as long as the information is
transmitted through a channel at a rate below the channel capacity, error-free communications
can be possible. Given noisy channels, Shannon showed the existence of good codes that can
make the probability of transmission error arbitrarily small. This noisy channel coding theorem
gave rise to the modern field of error correcting codes. Coincidentally, the invention of the
first transistor in the same year (by Bill Shockley, Walter Brattain, and John Bardeen) paved
the way to the design and implementation of more compact, more powerful, and less noisy
circuits to put Shannon's theorems into practical use. The Mariner IX Mars orbiter, launched
in May of 1971, was the first NASA mission officially equipped with error correcting codes,
which reliably transmitted photos taken from Mars.

Today, we are in an era of digital and multimedia communications, marked by the widespread
applications of computer networking and cellular phones. The first telephone modem
for home computer connection to a mainframe was developed by AT&T Bell Labs in 1962. It
uses an acoustic coupler to interface with a regular telephone handset. The acoustic coupler
converts the local computer data into audible tones and uses the regular telephone microphone
to transmit the tones over telephone lines. The coupler receives the mainframe computer data
via the telephone headphone and converts them into bits for the local computer terminal,
typically at rates below 300 bit/s. Rapid advances in integrated circuits (first credited to Jack
Kilby in 1958) and digital communication technology dramatically increased the link rate to
56 kbit/s by the 1990s. By 2000, wireless local area network (WLAN) modems were developed
to connect computers at speeds up to 11 Mbit/s. These commercial WLAN modems, the size
of a credit card, were first standardized as IEEE 802.11b.

Technological advances also dramatically reshaped the cellular systems. While the cellular
concept was developed in 1947 at Bell Labs, commercial systems were not available until 1983.
The “mobile” phones of the 1980s were bulky and expensive, mainly used for business. The
world’s first cellular phone, developed by Motorola in 1983 and known as DynaTAC 8000X,
weighed 28 ounces, earning the nickname of “brick” and costing $3995. These analog phones
are basically two-way FM radios for voice only. Today, a cellphone is truly a multimedia,
multifunctional device that is useful not only for voice communication but also for sending
and receiving e-mail, accessing websites, and displaying videos. Cellular devices are now very small,
weighing no more than a few ounces. Unlike in the past, cellular phones are now for the masses.
In fact, Europe now has more cellphones than people. In Africa, 13% of the adult population
now owns a cellular phone.

Throughout history, the progress of human civilization has been inseparable from technological
advances in telecommunications. Telecommunications played a key role in almost
every major historical event. It is not an exaggeration to state that telecommunications helped
shape the very world we live in today and will continue to define our future. It is therefore
the authors' hope that this text can help stimulate the interest of many students in telecommunication
technologies. By providing the fundamental principles of modern digital and analog
communication systems, the authors hope to provide a solid foundation for the training of
future generations of communication scientists and engineers.

2 SIGNALS AND SIGNAL SPACE


In this chapter we discuss certain basic signal concepts. Signals are processed by systems.
We shall start with explaining the terms signals and systems.

Signals 

A signal, as the term implies, is a set of information or data. Examples include a telephone or 
a television signal, the monthly sales figures of a corporation, and closing stock prices (e.g., in 
the United States, the Dow Jones averages). In all these examples, the signals are functions of 
the independent variable time. This is not always the case, however. When an electrical charge 
is distributed over a surface, for instance, the signal is the charge density, a function of space 
rather than time. In this book we deal almost exclusively with signals that are functions of 
time. The discussion, however, applies equally well to other independent variables. 

Systems 

Signals may be processed further by systems, which may modify them or extract additional 
information from them. For example, an antiaircraft missile launcher may want to know the 
future location of a hostile moving target, which is being tracked by radar. Since the radar 
signal gives the past location and velocity of the target, by properly processing the radar signal 
(the input), one can approximately estimate the future location of the target. Thus, a system 
is an entity that processes a set of signals (inputs) to yield another set of signals (outputs). 
A system may be made up of physical components, as in electrical, mechanical, or hydraulic 
systems (hardware realization), or it may be an algorithm that computes an output from an 
input signal (software realization). 


2.1 SIZE OF A SIGNAL 

Signal Energy 

The size of any entity is a quantity that indicates its strength. Generally speaking, a signal
varies with time. To set a standard quantity that measures signal strength, we normally view
a signal $g(t)$ as a voltage across a one-ohm resistor. We define signal energy $E_g$ of the signal
$g(t)$ as the energy that the voltage $g(t)$ dissipates on the resistor. More formally, we define $E_g$
(for a real signal) as

$$E_g = \int_{-\infty}^{\infty} g^2(t)\,dt \qquad (2.1)$$

This definition can be generalized to a complex-valued signal $g(t)$ as

$$E_g = \int_{-\infty}^{\infty} |g(t)|^2\,dt \qquad (2.2)$$

Figure 2.1 Examples of signals. (a) Signal with finite energy. (b) Signal with finite power.

Signal Power

To be a meaningful measure of signal size, the signal energy must be finite. A necessary
condition for energy to be finite is that the signal amplitude goes to zero as $|t|$ approaches
infinity (Fig. 2.1a). Otherwise the integral in Eq. (2.1) will not converge.

If the amplitude of $g(t)$ does not go to zero as $|t|$ approaches infinity (Fig. 2.1b), the signal
energy is infinite. A more meaningful measure of the signal size in such a case would be the
time average of the energy (if it exists), which is the average power $P_g$ defined (for a real
signal) by

$$P_g = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} g^2(t)\,dt \qquad (2.3)$$

We can generalize this definition for a complex signal $g(t)$ as

$$P_g = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} |g(t)|^2\,dt \qquad (2.4)$$

Observe that the signal power $P_g$ is the time average (mean) of the signal amplitude square,
that is, the mean square value of $g(t)$. Indeed, the square root of $P_g$ is the familiar rms (root
mean square) value of $g(t)$.

The mean of an entity averaged over a large time interval approaching infinity exists if
the entity either is periodic or has a statistical regularity. If such a condition is not satisfied, an
average may not exist. For instance, a ramp signal $g(t) = t$ increases indefinitely as $|t| \to \infty$,
and neither the energy nor the power exists for this signal.
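
The behavior described by Eqs. (2.1) and (2.3) can be observed numerically. The Python sketch below is only an illustration under stated assumptions: the helper names (`integrate`, `energy`, `average_power`), the midpoint-rule integration, and the two test signals (`pulse` and `tone`) are introduced here and do not come from the text. For the decaying pulse, the energy over a growing window settles near 1; for the sinusoid of amplitude 3, the energy keeps growing but the time-averaged energy settles near 4.5.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def energy(g, a, b):
    """Energy of g over the finite window [a, b]; Eq. (2.1) is the limit of a growing window."""
    return integrate(lambda t: g(t) ** 2, a, b)

def average_power(g, T):
    """Time-averaged energy over [-T/2, T/2]; Eq. (2.3) is the limit as T grows."""
    return energy(g, -T / 2, T / 2) / T

def pulse(t):
    """Assumed test signal that decays to zero (finite energy)."""
    return math.exp(-abs(t))

def tone(t):
    """Assumed test signal that never decays (finite power, infinite energy)."""
    return 3.0 * math.cos(2.0 * math.pi * t)

for T in (10.0, 100.0, 1000.0):
    # Window length, energy of the pulse, average power of the tone.
    print(T, energy(pulse, -T / 2, T / 2), average_power(tone, T))
```

The finite window stands in for the limit in the definitions, so the printed values are approximations rather than exact results.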

Units of Signal Energy and Power 

The standard units of signal energy and power are the joule and the watt. However, in practice,
it is often customary to use logarithmic scales to describe signal power. This notation saves
the trouble of dealing with many decimal places when signal power is large or small. As a
convention, a signal with average power of $P$ watts can be said to have power of

$$[10 \cdot \log_{10} P]\ \text{dBw} \quad \text{or} \quad [30 + 10 \cdot \log_{10} P]\ \text{dBm}$$

For example, $-30$ dBm represents signal power of $10^{-6}$ W in normal decimal scale.
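
The two logarithmic conventions can be written as a pair of conversions. The following Python sketch is a minimal illustration; the function names are assumptions chosen here, and only the two formulas above are taken from the text.

```python
import math

def watts_to_dbw(p_watts):
    """Power in dBw: 10 * log10(P), with P in watts."""
    return 10.0 * math.log10(p_watts)

def watts_to_dbm(p_watts):
    """Power in dBm: 30 + 10 * log10(P), with P in watts."""
    return 30.0 + 10.0 * math.log10(p_watts)

def dbm_to_watts(p_dbm):
    """Inverse of watts_to_dbm."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)

print(dbm_to_watts(-30.0))                     # about 1e-06 W, as in the example above
print(watts_to_dbw(1.0), watts_to_dbm(1.0))    # 1 W is 0 dBw and 30 dBm
```

Because 1 W equals 1000 mW, the dBm value of any power is always 30 higher than its dBw value.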


Example 2.1

Determine the suitable measures of the signals in Fig. 2.2.

The signal in Fig. 2.2a approaches 0 as $|t| \to \infty$. Therefore, the suitable measure for this
signal is its energy $E_g$, given by

$$E_g = \int_{-\infty}^{\infty} g^2(t)\,dt = \int_{-1}^{0} (2)^2\,dt + \int_{0}^{\infty} 4e^{-t}\,dt = 4 + 4 = 8$$

The signal in Fig. 2.2b does not approach 0 as $|t| \to \infty$. However, it is periodic, and
therefore its power exists. We can use Eq. (2.3) to determine its power. For periodic signals,
we can simplify the procedure by observing that a periodic signal repeats regularly each
period (2 seconds in this case). Therefore, averaging $g^2(t)$ over an infinitely large interval
is equivalent to averaging it over one period (2 seconds in this case). Thus

$$P_g = \frac{1}{2} \int_{-1}^{1} g^2(t)\,dt = \frac{1}{2} \int_{-1}^{1} t^2\,dt = \frac{1}{3}$$

Recall that the signal power is the square of its rms value. Therefore, the rms value of this
signal is $1/\sqrt{3}$.
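
The results of Example 2.1 can be cross-checked numerically. The sketch below rests on assumptions: the two signal shapes are inferred from the integrands of the example (a value of 2 on $-1 \le t < 0$ followed by $2e^{-t/2}$ for $t \ge 0$, and a sawtooth equal to $t$ on $[-1, 1)$ with period 2), the infinite upper limit is truncated at $t = 60$, and all function names are hypothetical.

```python
import math

def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def g_a(t):
    """Assumed shape of the energy signal: 2 on [-1, 0), then 2*exp(-t/2) for t >= 0."""
    if -1.0 <= t < 0.0:
        return 2.0
    if t >= 0.0:
        return 2.0 * math.exp(-t / 2.0)
    return 0.0

def g_b(t):
    """Assumed shape of the periodic signal: equal to t on [-1, 1), repeated with period 2."""
    return ((t + 1.0) % 2.0) - 1.0

# Energy: the tail beyond t = 60 is negligible, so a finite window is used.
energy_a = integrate(lambda t: g_a(t) ** 2, -1.0, 60.0)
# Power: average of g^2 over one period of length 2.
power_b = integrate(lambda t: g_b(t) ** 2, -1.0, 1.0) / 2.0
rms_b = math.sqrt(power_b)

print(energy_a, power_b, rms_b)
```

The three printed values should be close to 8, 1/3, and $1/\sqrt{3}$, up to the numerical integration error.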


2.2 CLASSIFICATION OF SIGNALS 

There are various classes of signals. Here we shall consider only the following pairs of classes, 
which are suitable for the scope of this book. 


1. Continuous time and discrete time signals
2. Analog and digital signals
3. Periodic and aperiodic signals
4. Energy and power signals
5. Deterministic and probabilistic signals

2.2.1 Continuous Time and Discrete Time Signals

A signal that is specified for every value of time $t$ (Fig. 2.3a) is a continuous time signal, and
a signal that is specified only at discrete points of $t = nT$ (Fig. 2.3b) is a discrete time signal.
Audio and video recordings are continuous time signals, whereas the quarterly gross domestic
product (GDP), monthly sales of a corporation, and stock market daily averages are discrete
time signals.

2.2.2 Analog and Digital Signals 

One should not confuse analog signals with continuous time signals. The two concepts are
not the same. This is also true of the concepts of discrete time and digital. A signal whose
amplitude can take on any value in a continuous range is an analog signal. This means that
an analog signal amplitude can take on an (uncountably) infinite number of values. A digital
signal, on the other hand, is one whose amplitude can take on only a finite number of values.
Signals associated with a digital computer are digital because they take on only two values
(binary signals). For a signal to qualify as digital, the number of values need not be restricted
to two. It can be any finite number. A digital signal whose amplitudes can take on $M$ values is
an $M$-ary signal of which binary ($M = 2$) is a special case. The terms “continuous time” and
“discrete time” qualify the nature of signal along the time (horizontal) axis. The terms “analog”
and “digital,” on the other hand, describe the nature of the signal amplitude (vertical) axis.
Figure 2.4 shows examples of signals of various types. It is clear that analog is not necessarily
continuous time, whereas digital need not be discrete time. Figure 2.4c shows an example
of an analog but discrete time signal. An analog signal can be converted into a digital signal
(via analog-to-digital, or A/D, conversion) through quantization (rounding off), as explained
in Chapter 6.

Figure 2.3 (a) Continuous time and (b) discrete time signals.

Figure 2.4 Examples of signals: (a) analog and continuous time, (b) digital and continuous time,
(c) analog and discrete time, (d) digital and discrete time.
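
As a rough illustration of rounding an analog amplitude to one of $M$ allowed values, consider the Python sketch below. It is not the quantizer developed in Chapter 6: the uniform spacing of levels, the amplitude range $[-1, 1]$, the clipping step, and the function name `quantize` are all assumptions made here for illustration.

```python
import math

def quantize(x, levels, lo=-1.0, hi=1.0):
    """Round x to the nearest of `levels` uniformly spaced amplitudes in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    x = min(max(x, lo), hi)           # clip to the allowed amplitude range
    index = round((x - lo) / step)    # index of the nearest level
    return lo + index * step

# Analog, discrete time: amplitudes may be any real value in [-1, 1].
samples = [math.sin(2.0 * math.pi * n / 16.0) for n in range(16)]
# Digital (4-ary), discrete time: amplitudes restricted to 4 values.
digital = [quantize(s, levels=4) for s in samples]

print(sorted(set(digital)))
```

With `levels=2` the same function produces a binary signal, the special case $M = 2$.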


2.2.3 Periodic and Aperiodic Signals 

A signal $g(t)$ is said to be periodic if there exists a positive constant $T_0$ such that

$$g(t) = g(t + T_0) \quad \text{for all } t \qquad (2.5)$$

The smallest value of $T_0$ that satisfies the periodicity condition of Eq. (2.5) is the period of
$g(t)$. The signal in Fig. 2.2b is a periodic signal with period of 2. Naturally, a signal is aperiodic
if it is not periodic. The signal in Fig. 2.2a is aperiodic.

By definition, a periodic signal $g(t)$ remains unchanged when time-shifted by one period.
This means that a periodic signal must start at $t = -\infty$ because if it starts at some finite instant,
say, $t = 0$, the time-shifted signal $g(t + T_0)$ will start at $t = -T_0$ and $g(t + T_0)$ would not
be the same as $g(t)$. Therefore, a periodic signal, by definition, must start from $-\infty$ and
continue forever, as shown in Fig. 2.5. Observe that a periodic signal shifted by an integral
multiple of $T_0$ remains unchanged. Therefore, $g(t)$ may be considered to be a periodic signal
with period $mT_0$, where $m$ is any integer. However, by definition, the period is the smallest
interval that satisfies the periodicity condition of Eq. (2.5). Therefore, $T_0$ is the period.

2.2.4 Energy and Power Signals

A signal with finite energy is an energy signal, and a signal with finite power is a power signal.
In other words, a signal $g(t)$ is an energy signal if

$$\int_{-\infty}^{\infty} |g(t)|^2\,dt < \infty \qquad (2.6)$$

Similarly, a signal with a finite and nonzero power (mean square value) is a power signal. In
other words, a signal is a power signal if

$$0 < \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} |g(t)|^2\,dt < \infty \qquad (2.7)$$

The signals in Fig. 2.2a and 2.2b are examples of energy and power signals, respectively.
Observe that power is the time average of the energy. Since the averaging is over an infinitely
large interval, a signal with finite energy has zero power, and a signal with finite power has
infinite energy. Therefore, a signal cannot be both an energy and a power signal. If it is one,
it cannot be the other. On the other hand, some signals with infinite power are neither energy
nor power signals. The ramp signal is one example.

Comments 

Every signal observed in real life is an energy signal. A power signal, on the other hand,
must have an infinite duration. Otherwise its power, which is its average energy (averaged
over an infinitely large interval), will not approach a (nonzero) limit. Obviously it is impossible
to generate a true power signal in practice because such a signal would have infinite duration
and infinite energy.

Also, because of periodic repetition, periodic signals for which the area under $|g(t)|^2$ over
one period is finite are power signals; however, not all power signals are periodic.

2.2.5 Deterministic and Random Signals 

A signal whose physical description is known completely, either in a mathematical form or a
graphical form, is a deterministic signal. A signal that is known only in terms of probabilistic
description, such as mean value, mean square value, and distributions, rather than its full mathematical
or graphical description is a random signal. Most of the noise signals encountered in
practice are random signals. All message signals are random signals because, as will be shown
later, a signal, to convey information, must have some uncertainty (randomness) about it. The
treatment of random signals will be discussed in later chapters.

Figure 2.6


2.3 UNIT IMPULSE SIGNAL 

The unit impulse function $\delta(t)$ is one of the most important functions in the study of signals
and systems. Its definition and application provide much convenience that is not permissible
in pure mathematics.

The unit impulse function $\delta(t)$ was first defined by P. A. M. Dirac (hence often known as
the “Dirac delta”) as

$$\delta(t) = 0, \quad t \neq 0 \qquad (2.8)$$

$$\int_{-\infty}^{\infty} \delta(t)\,dt = 1 \qquad (2.9)$$

We can visualize an impulse as a tall, narrow rectangular pulse of unit area, as shown in Fig. 2.6.
The width of this rectangular pulse is a very small value $\epsilon$; its height is a very large value $1/\epsilon$
in the limit as $\epsilon \to 0$. The unit impulse therefore can be regarded as a rectangular pulse with a
width that has become infinitesimally small, a height that has become infinitely large, and an
overall area that remains constant at unity.* Thus, $\delta(t) = 0$ everywhere except at $t = 0$, where
it is, strictly speaking, undefined. For this reason, a unit impulse is graphically represented by
the spearlike symbol in Fig. 2.6a.


Multiplication of a Function by an Impulse 

Let us now consider what happens when we multiply the unit impulse $\delta(t)$ by a function $\phi(t)$
that is known to be continuous at $t = 0$. Since the impulse exists only at $t = 0$, and the value
of $\phi(t)$ at $t = 0$ is $\phi(0)$, we obtain

$$\phi(t)\delta(t) = \phi(0)\delta(t) \qquad (2.10a)$$

Similarly, if $\phi(t)$ is multiplied by an impulse $\delta(t - T)$ (an impulse located at $t = T$), then

$$\phi(t)\delta(t - T) = \phi(T)\delta(t - T) \qquad (2.10b)$$

provided $\phi(t)$ is defined at $t = T$.

* The impulse function can also be approximated by other pulses, such as a positive triangle, an exponential pulse,
or a Gaussian pulse.

Figure 2.7

The Sampling Property of the Unit Impulse Function

From Eq. (2.10) it follows that

$$\int_{-\infty}^{\infty} \phi(t)\delta(t - T)\,dt = \phi(T) \int_{-\infty}^{\infty} \delta(t - T)\,dt = \phi(T) \qquad (2.11a)$$

provided $\phi(t)$ is continuous at $t = T$. This result means that the area under the product of
a function with an impulse $\delta(t)$ is equal to the value of that function at the instant where the
unit impulse is located. This very important and useful property is known as the sampling (or
sifting) property of the unit impulse.

Depending on the value of $T$ and the integration limit, the impulse function may or may
not be within the integration limit. Thus, it follows that

$$\int_a^b \phi(t)\delta(t - T)\,dt = \phi(T) \int_a^b \delta(t - T)\,dt = \begin{cases} \phi(T) & a \le T < b \\ 0 & T < a \le b, \text{ or } T \ge b > a \end{cases} \qquad (2.11b)$$
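
The sampling property can be made concrete by replacing the impulse with the narrow rectangular pulse used earlier to visualize it. The Python sketch below is an illustration under assumptions: the pulse is centered on the impulse location, the integral is approximated with a midpoint rule, and the names `rect_impulse` and `sampled_value` are hypothetical. As the pulse width `eps` shrinks, the integral should approach $\phi(T)$ when $T$ lies inside the integration limits, and it is zero when the pulse lies entirely outside them.

```python
import math

def rect_impulse(t, eps):
    """Rectangular pulse of width eps and height 1/eps centered at t = 0 (unit area)."""
    return 1.0 / eps if abs(t) < eps / 2.0 else 0.0

def sampled_value(phi, T, eps, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of phi(t) * pulse(t - T) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += phi(t) * rect_impulse(t - T, eps)
    return h * total

# Impulse located at T = 1, inside the limits [0, 3]: values approach cos(1).
for eps in (0.5, 0.1, 0.01):
    print(eps, sampled_value(math.cos, 1.0, eps, 0.0, 3.0), math.cos(1.0))

# Impulse located at T = 5, outside the limits [0, 3]: the integral vanishes.
print(sampled_value(math.cos, 5.0, 0.01, 0.0, 3.0))
```

The residual difference from $\phi(T)$ at small `eps` comes from the finite integration grid, not from the property itself.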


The Unit Step Function u(t) 

Another familiar and useful function is the unit step function $u(t)$, often encountered in circuit
analysis and defined by Fig. 2.7a:

$$u(t) = \begin{cases} 1 & t \ge 0 \\ 0 & t < 0 \end{cases} \qquad (2.12)$$

If we want a signal to start at $t = 0$ (so that it has a value of zero for $t < 0$), we need only
multiply the signal by $u(t)$. A signal that starts after $t = 0$ is called a causal signal. In other
words, $g(t)$ is a causal signal if

$$g(t) = 0 \qquad t < 0$$

The signal $e^{-at}$ represents an exponential that starts at $t = -\infty$. If we want this signal to start
at $t = 0$ (the causal form), it can be described as $e^{-at}u(t)$ (Fig. 2.7b). From Fig. 2.6b, we
observe that the area from $-\infty$ to $t$ under the limiting form of $\delta(t)$ is zero if $t < 0$ and unity
if $t \ge 0$. Consequently,

$$\int_{-\infty}^{t} \delta(\tau)\,d\tau = \begin{cases} 0 & t < 0 \\ 1 & t \ge 0 \end{cases} = u(t) \qquad (2.13a)$$

From this result it follows that

$$\frac{du}{dt} = \delta(t) \qquad (2.13b)$$


2.4 SIGNALS VERSUS VECTORS 

There is a strong connection between signals and vectors. Signals that are defined for only a
finite number of time instants (say $N$) can be written as vectors (of dimension $N$). Thus, consider
a signal $g(t)$ defined over a closed time interval $[a, b]$. Let us pick $N$ points uniformly on the
time interval $[a, b]$ such that

$$t_1 = a, \quad t_2 = a + \epsilon, \quad t_3 = a + 2\epsilon, \quad \ldots, \quad t_N = a + (N - 1)\epsilon = b, \qquad \epsilon = \frac{b - a}{N - 1}$$

Then we can write a signal vector $\mathbf{g}$ as an $N$-dimensional vector

$$\mathbf{g} = [\, g(t_1) \quad g(t_2) \quad \cdots \quad g(t_N) \,]$$

As the number of time instants $N$ increases, the sampled signal vector $\mathbf{g}$ will grow. Eventually,
as $N \to \infty$, the signal values will form a vector $\mathbf{g}$ of infinitely long dimension. Because $\epsilon \to 0$,
the signal vector $\mathbf{g}$ will transform into the continuous-time signal $g(t)$ defined over the interval
$[a, b]$. In other words,

$$\lim_{N \to \infty} \mathbf{g} = g(t) \qquad t \in [a, b]$$

This relationship clearly shows that continuous time signals are straightforward generalizations
of finite dimension vectors. Thus, basic definitions and operations in a vector space can be
applied to continuous time signals as well. We now highlight this connection between the finite
dimension vector space and the continuous time signal space.

We shall denote all vectors by boldface type. For example, $\mathbf{x}$ is a certain vector with
magnitude or length $\|\mathbf{x}\|$. A vector has magnitude and direction. In a vector space, we can
define the inner (dot or scalar) product of two real-valued vectors $\mathbf{g}$ and $\mathbf{x}$ as

$$\langle \mathbf{g}, \mathbf{x} \rangle = \|\mathbf{g}\| \cdot \|\mathbf{x}\| \cos\theta \qquad (2.14)$$

where $\theta$ is the angle between vectors $\mathbf{g}$ and $\mathbf{x}$. By using this definition, we can express $\|\mathbf{x}\|$,
the length (norm) of a vector $\mathbf{x}$ as

$$\|\mathbf{x}\|^2 = \langle \mathbf{x}, \mathbf{x} \rangle \qquad (2.15)$$

This defines a normed vector space.
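
The passage from sampled vectors to continuous time signals can be sketched numerically. In the Python fragment below, the helper names (`sample_vector`, `inner`) and the test signal $g(t) = t$ on $[0, 1]$ are assumptions introduced for illustration. One point deserves care: the plain inner product of the sample vector with itself grows with $N$, so the sketch scales it by the sample spacing $\epsilon$; it is this scaled sum that approaches the integral of $g^2(t)$ over the interval, which equals $1/3$ for the chosen signal.

```python
def sample_vector(g, a, b, N):
    """Sample g at N uniformly spaced instants t_i = a + i*eps covering [a, b]."""
    eps = (b - a) / (N - 1)
    return [g(a + i * eps) for i in range(N)], eps

def inner(u, v):
    """Inner (dot) product of two real vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def g(t):
    """Assumed test signal on [0, 1]; the integral of g^2 over [0, 1] is 1/3."""
    return t

for N in (11, 101, 1001):
    vec, eps = sample_vector(g, 0.0, 1.0, N)
    # Vector dimension, raw inner product, and inner product scaled by eps.
    print(N, inner(vec, vec), eps * inner(vec, vec))
```

The scaled values move toward $1/3$ as $N$ increases, which is the sense in which vector operations carry over to signals.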

2.4.1 Component of a Vector along Another Vector 

Consider two vectors $\mathbf{g}$ and $\mathbf{x}$, as shown in Fig. 2.8. Let the component of $\mathbf{g}$ along $\mathbf{x}$ be $c\mathbf{x}$.
Geometrically the component of $\mathbf{g}$ along $\mathbf{x}$ is the projection of $\mathbf{g}$ on $\mathbf{x}$, and is obtained by
drawing a perpendicular from the tip of $\mathbf{g}$ on the vector $\mathbf{x}$, as shown in Fig. 2.8. What is the
mathematical significance of a component of a vector along another vector? As seen from
Fig. 2.8, the vector $\mathbf{g}$ can be expressed in terms of vector $\mathbf{x}$ as

$$\mathbf{g} = c\mathbf{x} + \mathbf{e} \qquad (2.16)$$

Figure 2.8 Component (projection) of a vector along another vector.

Figure 2.9 Approximations of a vector in terms of another vector.

However, this does not describe a unique way to decompose $\mathbf{g}$ in terms of $\mathbf{x}$ and $\mathbf{e}$. Figure 2.9
shows two of the infinite other possibilities. From Fig. 2.9a and b, we have

$$\mathbf{g} = c_1\mathbf{x} + \mathbf{e}_1 = c_2\mathbf{x} + \mathbf{e}_2 \qquad (2.17)$$

The question is: Which is the “best” decomposition? The concept of optimality depends on
what we wish to accomplish by decomposing $\mathbf{g}$ into two components.

In each of these three representations, $\mathbf{g}$ is given in terms of $\mathbf{x}$ plus another vector called
the error vector. If our goal is to approximate $\mathbf{g}$ by $c\mathbf{x}$ (Fig. 2.8),

$$\mathbf{g} \simeq \hat{\mathbf{g}} = c\mathbf{x} \qquad (2.18)$$

then the error in this approximation is the (difference) vector $\mathbf{e} = \mathbf{g} - c\mathbf{x}$. Similarly, the errors
in approximations of Fig. 2.9a and b are $\mathbf{e}_1$ and $\mathbf{e}_2$, respectively. The approximation in Fig. 2.8
is unique because its error vector is the shortest (with the smallest magnitude or norm). We
can now define mathematically the component (or projection) of a vector $\mathbf{g}$ along vector $\mathbf{x}$ to
be $c\mathbf{x}$, where $c$ is chosen to minimize the magnitude of the error vector $\mathbf{e} = \mathbf{g} - c\mathbf{x}$.

Geometrically, the magnitude of the component of $\mathbf{g}$ along $\mathbf{x}$ is $\|\mathbf{g}\| \cos\theta$, which is also
equal to $c\|\mathbf{x}\|$. Therefore

$$c\|\mathbf{x}\| = \|\mathbf{g}\| \cos\theta$$

Based on the definition of inner product between two vectors, multiplying both sides by $\|\mathbf{x}\|$
yields

$$c\|\mathbf{x}\|^2 = \|\mathbf{g}\|\,\|\mathbf{x}\| \cos\theta = \langle \mathbf{g}, \mathbf{x} \rangle$$

and

$$c = \frac{\langle \mathbf{g}, \mathbf{x} \rangle}{\langle \mathbf{x}, \mathbf{x} \rangle} = \frac{1}{\|\mathbf{x}\|^2} \langle \mathbf{g}, \mathbf{x} \rangle \qquad (2.19)$$

From Fig. 2.8, it is apparent that when $\mathbf{g}$ and $\mathbf{x}$ are perpendicular, or orthogonal, then $\mathbf{g}$ has
a zero component along $\mathbf{x}$; consequently, $c = 0$. Keeping an eye on Eq. (2.19), we therefore
define $\mathbf{g}$ and $\mathbf{x}$ to be orthogonal if the inner (scalar or dot) product of the two vectors is zero,
that is, if

$$\langle \mathbf{g}, \mathbf{x} \rangle = 0 \qquad (2.20)$$
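
A small numerical instance may help fix the idea of Eq. (2.19). The Python sketch below uses two hypothetical two-dimensional vectors chosen for easy arithmetic; the function names are assumptions. It computes $c$, forms the error vector, and confirms that this error is orthogonal to $\mathbf{x}$ and shorter than the error obtained with another choice of the coefficient.

```python
import math

def inner(g, x):
    """Inner (dot) product of two real vectors of equal length."""
    return sum(gi * xi for gi, xi in zip(g, x))

def norm(v):
    """Length of a vector, the square root of <v, v>."""
    return math.sqrt(inner(v, v))

def error_vector(g, x, c):
    """Error e = g - c*x left after approximating g by c*x."""
    return [gi - c * xi for gi, xi in zip(g, x)]

def projection_coefficient(g, x):
    """Coefficient c of Eq. (2.19): <g, x> / <x, x>."""
    return inner(g, x) / inner(x, x)

g = [3.0, 4.0]   # hypothetical vector to be approximated
x = [2.0, 0.0]   # hypothetical vector along which g is projected

c = projection_coefficient(g, x)   # <g, x> = 6 and <x, x> = 4, so c = 1.5
e = error_vector(g, x, c)          # [0.0, 4.0]

print(c, e, inner(e, x))                            # the error is orthogonal to x
print(norm(e), norm(error_vector(g, x, 1.0)))       # 4.0 versus a longer error
```

Any coefficient other than the one from Eq. (2.19) leaves an error vector with a component along $\mathbf{x}$, and therefore a longer one.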

2.4.2 Decomposition of a Signal and Signal Components

The concepts of vector component and orthogonality can be directly extended to continuous
time signals. Consider the problem of approximating a real signal $g(t)$ in terms of another real
signal $x(t)$ over an interval $[t_1, t_2]$:

$$g(t) \simeq c\,x(t) \qquad t_1 \le t \le t_2 \qquad (2.21)$$

The error $e(t)$ in this approximation is

$$e(t) = \begin{cases} g(t) - c\,x(t) & t_1 \le t \le t_2 \\ 0 & \text{otherwise} \end{cases} \qquad (2.22)$$

For “best approximation,” we need to minimize the error signal, that is, minimize its norm.
Minimum signal norm corresponds to minimum energy $E_e$ over the interval $[t_1, t_2]$ given by

$$E_e = \int_{t_1}^{t_2} e^2(t)\,dt = \int_{t_1}^{t_2} [g(t) - c\,x(t)]^2\,dt$$

Note that the right-hand side is a definite integral with $t$ as the dummy variable. Hence $E_e$ is a
function of the parameter $c$ (not $t$), and $E_e$ is minimum for some choice of $c$. To minimize $E_e$,
a necessary condition is

$$\frac{dE_e}{dc} = 0 \qquad (2.23)$$

or

$$\frac{d}{dc} \left[ \int_{t_1}^{t_2} [g(t) - c\,x(t)]^2\,dt \right] = 0$$

Expanding the squared term inside the integral, we obtain

$$\frac{d}{dc} \left[ \int_{t_1}^{t_2} g^2(t)\,dt \right] - \frac{d}{dc} \left[ 2c \int_{t_1}^{t_2} g(t)x(t)\,dt \right] + \frac{d}{dc} \left[ c^2 \int_{t_1}^{t_2} x^2(t)\,dt \right] = 0$$

from which we obtain

$$-2 \int_{t_1}^{t_2} g(t)x(t)\,dt + 2c \int_{t_1}^{t_2} x^2(t)\,dt = 0$$

and

$$c = \frac{\int_{t_1}^{t_2} g(t)x(t)\,dt}{\int_{t_1}^{t_2} x^2(t)\,dt} = \frac{1}{E_x} \int_{t_1}^{t_2} g(t)x(t)\,dt \qquad (2.24)$$

To summarize our discussion, if a signal $g(t)$ is approximated by another signal $x(t)$ as

$$g(t) \simeq c\,x(t)$$

then the optimum value of $c$ that minimizes the energy of the error signal in this approximation
is given by Eq. (2.24).

Taking our cue from vectors, we say that a signal $g(t)$ contains a component $c\,x(t)$, where
$c$ is given by Eq. (2.24). As in vector space, $c\,x(t)$ is the projection of $g(t)$ on $x(t)$. Consistent
with the vector space terminology, we say that if the component of a signal $g(t)$ of the form
$x(t)$ is zero (i.e., $c = 0$), the signals $g(t)$ and $x(t)$ are orthogonal over the interval $[t_1, t_2]$. In
other words, with respect to real-valued signals, two signals $x(t)$ and $g(t)$ are orthogonal when
there is zero contribution from one signal to the other (i.e., $c = 0$). Thus, $x(t)$ and $g(t)$ are
orthogonal if and only if

$$\int_{t_1}^{t_2} g(t)x(t)\,dt = 0 \qquad (2.25)$$

Based on the illustrations of vectors in Fig. 2.9, we can say that two signals are orthogonal if
and only if their inner product is zero. This relationship indicates that the integral of Eq. (2.25)
is closely related to the concept of an inner product between vectors.

Indeed, the standard definition of the inner product of two $N$-dimensional vectors $\mathbf{g}$ and $\mathbf{x}$

$$ \langle \mathbf{g}, \mathbf{x} \rangle = \sum_{i=1}^{N} g_i x_i $$

is almost identical in form to the integration of Eq. (2.25). We therefore define the inner product
of two (real-valued) signals $g(t)$ and $x(t)$, both defined over a time interval $[t_1, t_2]$, as

$$ \langle g(t), x(t) \rangle = \int_{t_1}^{t_2} g(t)x(t)\,dt \tag{2.26} $$

Recall from algebraic geometry that the square of a vector length $\|\mathbf{x}\|^2$ is equal to $\langle \mathbf{x}, \mathbf{x} \rangle$.
Keeping this concept in mind and continuing our analogy with vector analysis, we define
the norm of a signal $g(t)$ as

$$ \|g(t)\| = \sqrt{\langle g(t), g(t) \rangle} \tag{2.27} $$

which is the square root of the signal energy in the time interval. It is therefore clear that the
norm of a signal is analogous to the length of a finite dimensional vector. More generally,
signals may not be merely defined over a continuous segment $[t_1, t_2]$.*


* Indeed, the signal space under consideration may be over a set of time segments represented simply by $\Theta$. For
such a more general space of signals, the inner product is defined as an integral over the time domain $\Theta$. For

Example 2.2 For the square signal $g(t)$ shown in Fig. 2.10, find the component in $g(t)$ of the form $\sin t$.
In other words, approximate $g(t)$ in terms of $\sin t$:

$$ g(t) \simeq c \sin t, \qquad 0 \le t \le 2\pi $$

so that the energy of the error signal is minimum.


Figure 2.10 Approximation of square signal in terms of a single sinusoid.

In this case

$$ x(t) = \sin t \qquad \text{and} \qquad E_x = \int_0^{2\pi} \sin^2(t)\,dt = \pi $$

From Eq. (2.24), we find

$$ c = \frac{1}{\pi}\int_0^{2\pi} g(t)\sin t\,dt = \frac{1}{\pi}\left[\int_0^{\pi} \sin t\,dt + \int_{\pi}^{2\pi} (-\sin t)\,dt\right] = \frac{4}{\pi} \tag{2.29} $$

Therefore

$$ g(t) \simeq \frac{4}{\pi}\sin t \tag{2.30} $$

represents the best approximation of $g(t)$ by the function $\sin t$, which will minimize the
error signal energy. This sinusoidal component of $g(t)$ is shown shaded in Fig. 2.10. As
in vector space, we say that the square function $g(t)$ shown in Fig. 2.10 has a component
of signal $\sin t$ with magnitude of $4/\pi$.
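The result of Example 2.2 can be checked numerically. The following Python sketch is an illustration only, not part of the original text: it assumes the square signal equals +1 on [0, π) and −1 on [π, 2π], as the integrals in Eq. (2.29) suggest, and the helper names `integrate`, `best_coefficient`, and `error_energy` are hypothetical. It evaluates Eq. (2.24) with a midpoint rule and compares the error energy at the optimum with that at two nearby values of c.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def square(t):
    """Square signal assumed for Example 2.2: +1 on [0, pi), -1 on [pi, 2*pi]."""
    return 1.0 if t < math.pi else -1.0

def best_coefficient(g, x, t1, t2):
    """Eq. (2.24): c = (1/E_x) * integral of g(t) x(t) over [t1, t2]."""
    e_x = integrate(lambda t: x(t) ** 2, t1, t2)
    return integrate(lambda t: g(t) * x(t), t1, t2) / e_x

def error_energy(g, x, c, t1, t2):
    """Energy of the error e(t) = g(t) - c x(t) over [t1, t2]."""
    return integrate(lambda t: (g(t) - c * x(t)) ** 2, t1, t2)

T1, T2 = 0.0, 2 * math.pi
c_opt = best_coefficient(square, math.sin, T1, T2)
print("c =", c_opt, " 4/pi =", 4 / math.pi)
for c in (c_opt - 0.2, c_opt, c_opt + 0.2):
    print("c =", round(c, 4), " E_e =", error_energy(square, math.sin, c, T1, T2))
```

Moving c away from the computed optimum in either direction increases the printed error energy, which is the behavior the minimization condition of Eq. (2.23) predicts.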


2.4.3 Complex Signal Space and Orthogonality 

So far we have restricted our discussions to real functions of t. To generalize the results to
complex functions of t, consider again the problem of approximating a function g(t) by a 


complex valued signals, the inner product is modified into

$$ \langle g(t), x(t) \rangle = \int_{t \in \Theta} g(t)x^*(t)\,dt \tag{2.28} $$

Given the inner product definition, the signal norm $\|g(t)\| = \sqrt{\langle g(t), g(t) \rangle}$ and the signal space can be defined
for any time domain signal.

function $x(t)$ over an interval ($t_1 \le t \le t_2$)

$$ g(t) \simeq c\,x(t) \tag{2.31} $$

where $g(t)$ and $x(t)$ are complex functions of $t$. In general, both the coefficient $c$ and the error

$$ e(t) = g(t) - c\,x(t) \tag{2.32} $$

are complex. Recall that the energy $E_x$ of the complex signal $x(t)$ over an interval $[t_1, t_2]$ is

$$ E_x = \int_{t_1}^{t_2} |x(t)|^2\,dt $$

For the best approximation, we need to choose $c$ that minimizes $E_e$, the energy of the error
signal $e(t)$ given by

$$ E_e = \int_{t_1}^{t_2} |g(t) - c\,x(t)|^2\,dt \tag{2.33} $$

Recall also that

$$ |u + v|^2 = (u + v)(u^* + v^*) = |u|^2 + |v|^2 + u^*v + uv^* \tag{2.34} $$


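Using this result, we can, after some manipulation, express the integral $E_e$ in Eq. (2.33) as

$$ E_e = \int_{t_1}^{t_2} |g(t)|^2\,dt - \left|\frac{1}{\sqrt{E_x}}\int_{t_1}^{t_2} g(t)x^*(t)\,dt\right|^2 + \left|c\sqrt{E_x} - \frac{1}{\sqrt{E_x}}\int_{t_1}^{t_2} g(t)x^*(t)\,dt\right|^2 $$

Since the first two terms on the right-hand side are independent of $c$, it is clear that $E_e$ is
minimized by choosing $c$ such that the third term is zero. This yields the optimum coefficient

$$ c = \frac{1}{E_x}\int_{t_1}^{t_2} g(t)x^*(t)\,dt \tag{2.35} $$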
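The manipulation leading to the three-term expression for the error energy can be sketched as follows. This is one possible route, not necessarily the one the authors had in mind; the shorthand $A$ for the integral of $g(t)x^*(t)$ and $E_g$ for the energy of $g(t)$ over $[t_1, t_2]$ are introduced here for convenience. The first step applies Eq. (2.34) with $u = g(t)$ and $v = -c\,x(t)$, and the last step completes the square in $c$.

```latex
\begin{aligned}
E_e &= \int_{t_1}^{t_2} \bigl[g(t) - c\,x(t)\bigr]\bigl[g^*(t) - c^*x^*(t)\bigr]\,dt \\
    &= E_g - c^*A - cA^* + |c|^2 E_x,
       \qquad A \equiv \int_{t_1}^{t_2} g(t)x^*(t)\,dt \\
    &= E_g - \frac{|A|^2}{E_x} + \left| c\sqrt{E_x} - \frac{A}{\sqrt{E_x}} \right|^2
\end{aligned}
```

Only the last term depends on $c$, and it vanishes when $c = A/E_x$, which agrees with Eq. (2.35).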

In light of the foregoing result, we need to redefine orthogonality for the complex case as
follows: complex functions (signals) $x_1(t)$ and $x_2(t)$ are orthogonal over an interval
($t_1 \le t \le t_2$) as long as

$$ \int_{t_1}^{t_2} x_1(t)x_2^*(t)\,dt = 0 \qquad \text{or} \qquad \int_{t_1}^{t_2} x_1^*(t)x_2(t)\,dt = 0 \tag{2.36} $$

In fact, either equality suffices. This is a general definition of orthogonality, which reduces to
Eq. (2.25) when the functions are real.

Similarly, the definition of inner product for complex signals over a time domain $\Theta$ can
be modified:

$$ \langle g(t), x(t) \rangle = \int_{t \in \Theta} g(t)x^*(t)\,dt \tag{2.37} $$

Consequently, the norm of a signal $g(t)$ is simply

$$ \|g(t)\| = \left[\int_{t \in \Theta} |g(t)|^2\,dt\right]^{1/2} \tag{2.38} $$

2.4.4 Energy of the Sum of Orthogonal Signals

We know that the square of the geometric length (or magnitude) of the sum of two orthogonal
vectors is equal to the sum of the magnitude squares of the two vectors. Thus, if vectors $\mathbf{x}$ and $\mathbf{y}$ are
orthogonal, and if $\mathbf{z} = \mathbf{x} + \mathbf{y}$, then

$$ \|\mathbf{z}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 $$

We have a similar result for signals. The energy of the sum of two orthogonal signals is equal
to the sum of the energies of the two signals. Thus, if signals $x(t)$ and $y(t)$ are orthogonal over
an interval $[t_1, t_2]$, and if $z(t) = x(t) + y(t)$, then

$$ E_z = E_x + E_y \tag{2.39} $$

We now prove this result for complex signals, of which real signals are a special case. From
Eq. (2.34) it follows that

$$ \begin{aligned} \int_{t_1}^{t_2} |x(t) + y(t)|^2\,dt &= \int_{t_1}^{t_2} |x(t)|^2\,dt + \int_{t_1}^{t_2} |y(t)|^2\,dt + \int_{t_1}^{t_2} x(t)y^*(t)\,dt + \int_{t_1}^{t_2} x^*(t)y(t)\,dt \\ &= \int_{t_1}^{t_2} |x(t)|^2\,dt + \int_{t_1}^{t_2} |y(t)|^2\,dt \end{aligned} \tag{2.40} $$

The last equality follows because, as a result of orthogonality, the two integrals of the cross
products $x(t)y^*(t)$ and $x^*(t)y(t)$ are zero. This result can be extended to the sum of any number of
mutually orthogonal signals.
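As a quick numerical illustration of Eq. (2.39), the sketch below uses two complex exponentials chosen for this example only, $x(t) = e^{jt}$ and $y(t) = 2e^{j2t}$ on $[0, 2\pi]$. The helper names `energy` and `inner` are hypothetical. It confirms that the inner product is (numerically) zero and that the energy of the sum equals the sum of the energies.

```python
import cmath
import math

def energy(f, t1, t2, n=4000):
    """Midpoint-rule approximation of the energy, the integral of |f(t)|^2."""
    h = (t2 - t1) / n
    return h * sum(abs(f(t1 + (k + 0.5) * h)) ** 2 for k in range(n))

def inner(f, g, t1, t2, n=4000):
    """Midpoint-rule approximation of <f, g> = integral of f(t) g*(t)."""
    h = (t2 - t1) / n
    return h * sum(f(t1 + (k + 0.5) * h) * g(t1 + (k + 0.5) * h).conjugate()
                   for k in range(n))

# Illustrative signals (not from the text): orthogonal over [0, 2*pi].
x = lambda t: cmath.exp(1j * t)
y = lambda t: 2 * cmath.exp(2j * t)
z = lambda t: x(t) + y(t)

T1, T2 = 0.0, 2 * math.pi
E_x, E_y, E_z = energy(x, T1, T2), energy(y, T1, T2), energy(z, T1, T2)
print("<x, y> =", inner(x, y, T1, T2))
print("E_x + E_y =", E_x + E_y, " E_z =", E_z)
```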


2.5 CORRELATION OF SIGNALS 


By defining the inner product and the norm of signals, we have laid the foundation for signal
comparison. Here again, we can benefit by drawing parallels to the familiar vector space. Two
vectors $\mathbf{g}$ and $\mathbf{x}$ are similar if $\mathbf{g}$ has a large component along $\mathbf{x}$. In other words, if $c$ in Eq. (2.19)
is large, the vectors $\mathbf{g}$ and $\mathbf{x}$ are similar. We could consider $c$ to be a quantitative measure of
similarity between $\mathbf{g}$ and $\mathbf{x}$. Such a measure, however, would be defective because it varies
with the norms (or lengths) of $\mathbf{g}$ and $\mathbf{x}$. To be fair, the amount of similarity between $\mathbf{g}$ and $\mathbf{x}$
should be independent of the lengths of $\mathbf{g}$ and $\mathbf{x}$. If we double the length of $\mathbf{g}$, for example,
the amount of similarity between $\mathbf{g}$ and $\mathbf{x}$ should not change. From Eq. (2.19), however, we
see that doubling $\mathbf{g}$ doubles the value of $c$ (whereas doubling $\mathbf{x}$ halves the value of $c$). A
similarity measure based on $c$ alone is therefore clearly faulty. Similarity between two vectors
is better indicated by the angle $\theta$ between the vectors. The smaller the $\theta$, the larger the similarity, and
vice versa. The amount of similarity can therefore be conveniently measured by $\cos\theta$. The
larger the $\cos\theta$, the larger the similarity between the two vectors. Thus, a suitable measure
would be $\rho = \cos\theta$, which is given by

$$ \rho = \cos\theta = \frac{\langle \mathbf{g}, \mathbf{x} \rangle}{\|\mathbf{g}\|\,\|\mathbf{x}\|} \tag{2.41} $$


We can readily verify that this measure is independent of the lengths of $\mathbf{g}$ and $\mathbf{x}$. This
similarity measure $\rho$ is known as the correlation coefficient. Observe that

$$ -1 \le \rho \le 1 \tag{2.42} $$

Thus, the magnitude of $\rho$ is never greater than unity. If the two vectors are aligned, the similarity
is maximum ($\rho = 1$). Two vectors aligned in opposite directions have maximum dissimilarity
($\rho = -1$). If the two vectors are orthogonal, the similarity is zero.

We use the same argument in defining a similarity index (the correlation coefficient) for
signals. For convenience, we shall consider the signals over the entire time interval from $-\infty$
to $\infty$. To establish a similarity index independent of energies (sizes) of $g(t)$ and $x(t)$, we
must normalize $c$ by normalizing the two signals to have unit energies. Thus, the appropriate
similarity index $\rho$ analogous to Eq. (2.41) is given by

$$ \rho = \frac{1}{\sqrt{E_g E_x}}\int_{-\infty}^{\infty} g(t)x^*(t)\,dt \tag{2.43} $$

Observe that multiplying either $g(t)$ or $x(t)$ by any constant has no effect on this index. Thus,
it is independent of the size (energies) of $g(t)$ and $x(t)$. Using the Cauchy-Schwarz inequality
(proved in Appendix B),† one can show that the magnitude of $\rho$ is never greater than 1:

$$ -1 \le \rho \le 1 \tag{2.44} $$
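The scale independence of the correlation coefficient is easy to confirm numerically. The sketch below is illustrative only: the two decaying exponentials, the finite window [0, 20] (outside of which the signals are treated as negligible), and the helper names are assumptions made for this example. It evaluates the similarity index for real signals, for which the conjugate has no effect.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def correlation_coefficient(g, x, a, b):
    """Similarity index for real signals: <g, x> / sqrt(E_g * E_x)."""
    e_g = integrate(lambda t: g(t) ** 2, a, b)
    e_x = integrate(lambda t: x(t) ** 2, a, b)
    return integrate(lambda t: g(t) * x(t), a, b) / math.sqrt(e_g * e_x)

# Illustrative causal signals; both are negligible beyond t = 20.
g = lambda t: math.exp(-t)
x = lambda t: math.exp(-2 * t)

rho = correlation_coefficient(g, x, 0.0, 20.0)
rho_scaled = correlation_coefficient(lambda t: 5 * g(t), x, 0.0, 20.0)
rho_opposite = correlation_coefficient(g, lambda t: -g(t), 0.0, 20.0)
print("rho =", rho)
print("rho with g scaled by 5 =", rho_scaled)
print("rho of g with -g =", rho_opposite)
```

For these two exponentials the integrals can also be done by hand, giving $\rho = (1/3)/\sqrt{(1/2)(1/4)} = 2\sqrt{2}/3$, which the numerical value should approach.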

2.5.1 Correlation Functions 


We should revisit the application of correlation to signal detection in a radar unit, where a
signal pulse is transmitted to detect a suspected target. By detecting the presence or absence
of the reflected pulse, we confirm the presence or absence of the target. By measuring the time
delay between the transmitted and received (reflected) pulse, we determine the distance of the
target. Let the transmitted and the reflected pulses be denoted by $g(t)$ and $z(t)$, respectively. If
we were to use Eq. (2.43) directly to measure the correlation coefficient, we would obtain

$$ \rho = \frac{1}{\sqrt{E_g E_z}}\int_{-\infty}^{\infty} z(t)g^*(t)\,dt = 0 \tag{2.45} $$


Thus, the correlation is zero because the pulses are disjoint (nonoverlapping in time). The
integral in Eq. (2.45) will yield zero even when the pulses are identical but with relative time
shift. To avoid this difficulty, we compare the received pulse $z(t)$ with the transmitted pulse
$g(t)$ shifted by $\tau$. If for some value of $\tau$ there is a strong correlation, we not only detect the
presence of the pulse but we also detect the relative time shift of $z(t)$ with respect to $g(t)$. For
this reason, instead of using the integral on the right-hand side, we use the modified integral
$\psi_{zg}(\tau)$, the cross-correlation function of two complex signals $g(t)$ and $z(t)$, defined by

$$ \psi_{zg}(\tau) \equiv \int_{-\infty}^{\infty} z(t)g^*(t - \tau)\,dt = \int_{-\infty}^{\infty} z(t + \tau)g^*(t)\,dt \tag{2.46} $$

Therefore, $\psi_{zg}(\tau)$ is an indication of similarity (correlation) of $g(t)$ with $z(t)$ advanced
(left-shifted) by $\tau$ seconds.
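A discrete-time analogue of the radar argument can be sketched in a few lines. Everything here is an assumption made for illustration: the pulse shape, the delay of 7 samples, the attenuation of 0.5, and the function name `cross_correlation`, which replaces the integral of Eq. (2.46) by a sum over real-valued samples (so no conjugate is needed).

```python
def cross_correlation(z, g, lag):
    """Discrete analogue of Eq. (2.46) for real samples: sum of z[n + lag] * g[n]."""
    total = 0.0
    for n, g_n in enumerate(g):
        m = n + lag
        if 0 <= m < len(z):
            total += z[m] * g_n
    return total

N = 32                                   # number of samples in each record
pulse = [1.0, 2.0, 3.0, 2.0, 1.0]        # illustrative transmitted pulse shape
delay = 7                                # assumed round-trip delay in samples

g = pulse + [0.0] * (N - len(pulse))     # transmitted pulse g[n]
z = [0.0] * N                            # received pulse: delayed and attenuated
for n, value in enumerate(pulse):
    z[n + delay] = 0.5 * value

zero_lag = cross_correlation(z, g, 0)
lags = range(-(N - 1), N)
best_lag = max(lags, key=lambda k: cross_correlation(z, g, k))
print("correlation at zero lag:", zero_lag)
print("lag with the strongest correlation:", best_lag)
```

The zero-lag correlation vanishes because the two pulses do not overlap, while the peak of the cross-correlation sits at the lag equal to the delay, which is the quantity a radar receiver is after.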


† The Cauchy-Schwarz inequality states that for two real energy signals $g(t)$ and $x(t)$, $\left(\int_{-\infty}^{\infty} g(t)x(t)\,dt\right)^2 \le E_g E_x$,
with equality if and only if $x(t) = Kg(t)$, where $K$ is an arbitrary constant. There is a similar inequality for complex
signals.

Figure 2.11 Physical explanation of the autocorrelation function.

2.5.2 Autocorrelation Function 


As shown in Fig. 2.11, correlation of a signal with itself is called the autocorrelation. The
autocorrelation function $\psi_g(\tau)$ of a real signal $g(t)$ is defined as

$$ \psi_g(\tau) \equiv \int_{-\infty}^{\infty} g(t)g(t + \tau)\,dt \tag{2.47} $$


It measures the similarity of the signal g(t) with its own displaced version. In Chapter 3, we 
shall show that the autocorrelation function provides valuable spectral information about the 
signal. 


2.6 ORTHOGONAL SIGNAL SET 

In this section we show a way of representing a signal as a sum of an orthogonal set of signals. In
effect, the signals in this orthogonal set form a basis for the specific signal space. Here again 
we can benefit from the insight gained from a similar problem in vectors. We know that a 
vector can be represented as a sum of orthogonal vectors, which form the coordinate system 
of a vector space. The problem in signals is analogous, and the results for signals are parallel 
to those for vectors. For this reason, let us review the case of vector representation. 

2.6.1 Orthogonal Vector Space 

Consider a multidimensional Cartesian vector space described by three mutually orthogonal
vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$, as shown in Fig. 2.12 for the special case of three-dimensional
vector space. First, we shall seek to approximate a three-dimensional vector $\mathbf{g}$ in terms of two
orthogonal vectors $\mathbf{x}_1$ and $\mathbf{x}_2$:

$$ \mathbf{g} \simeq c_1\mathbf{x}_1 + c_2\mathbf{x}_2 $$

The error $\mathbf{e}$ in this approximation is

$$ \mathbf{e} = \mathbf{g} - (c_1\mathbf{x}_1 + c_2\mathbf{x}_2) $$

or equivalently,

$$ \mathbf{g} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \mathbf{e} $$

In accordance with our earlier geometrical argument, it is clear from Fig. 2.12 that the length
of the error vector $\mathbf{e}$ is minimum when it is perpendicular to the $(\mathbf{x}_1, \mathbf{x}_2)$ plane, and when $c_1\mathbf{x}_1$ and
$c_2\mathbf{x}_2$ are the projections (components) of $\mathbf{g}$ on $\mathbf{x}_1$ and $\mathbf{x}_2$, respectively. Therefore, the constants
$c_1$ and $c_2$ are given by the formula in Eq. (2.19).

Now let us determine the best approximation to $\mathbf{g}$ in terms of all the three mutually
orthogonal vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$:

$$ \mathbf{g} \simeq c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 \tag{2.48} $$

Figure 2.12 shows that a unique choice of $c_1$, $c_2$, and $c_3$ exists, for which (2.48) is no longer
an approximation but an equality:

$$ \mathbf{g} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 $$

In this case, $c_1\mathbf{x}_1$, $c_2\mathbf{x}_2$, and $c_3\mathbf{x}_3$ are the projections (components) of $\mathbf{g}$ on $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$,
respectively. Note that the approximation error $\mathbf{e}$ is now zero when $\mathbf{g}$ is approximated in terms
of three mutually orthogonal vectors: $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$. This is because $\mathbf{g}$ is a three-dimensional
vector, and the vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ represent a complete set of orthogonal vectors in
three-dimensional space. Completeness here means that it is impossible in this space to find any other
vector $\mathbf{x}_4$ that is orthogonal to all the three vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$. Any vector in this space
can therefore be represented (with zero error) in terms of these three vectors. Such vectors are
known as basis vectors, and the set of vectors is known as a complete orthogonal basis of
this vector space. If a set of vectors $\{\mathbf{x}_i\}$ is not complete, then the approximation error will
generally not be zero. For example, in the three-dimensional case just discussed, it is
generally not possible to represent a vector $\mathbf{g}$ in terms of only two basis vectors without an
error.

The choice of basis vectors is not unique. In fact, each set of basis vectors corresponds to a
particular choice of coordinate system. Thus, a three-dimensional vector $\mathbf{g}$ may be represented
in many different ways depending on the coordinate system used.

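To summarize, if a set of vectors $\{\mathbf{x}_i\}$ is mutually orthogonal, that is, if

$$ \langle \mathbf{x}_m, \mathbf{x}_n \rangle = \begin{cases} 0 & m \ne n \\ \|\mathbf{x}_m\|^2 & m = n \end{cases} $$

and if this basis set is complete, a vector $\mathbf{g}$ in this space can be expressed as

$$ \mathbf{g} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 \tag{2.49} $$

where the constants $c_i$ are given by

$$ c_i = \frac{\langle \mathbf{g}, \mathbf{x}_i \rangle}{\langle \mathbf{x}_i, \mathbf{x}_i \rangle} \tag{2.50a} $$

$$ \phantom{c_i} = \frac{1}{\|\mathbf{x}_i\|^2}\langle \mathbf{g}, \mathbf{x}_i \rangle \qquad i = 1, 2, 3 \tag{2.50b} $$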
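A small numerical example may help fix the idea of Eqs. (2.49) and (2.50). The vectors below are arbitrary choices made for this sketch (they are mutually orthogonal but deliberately not of unit length), and the helper names are hypothetical. The code expands $\mathbf{g}$ on all three basis vectors, then on only two of them, and shows that the error left by the incomplete set is perpendicular to the vectors that were used.

```python
def dot(a, b):
    """Inner product of two real vectors given as sequences."""
    return sum(p * q for p, q in zip(a, b))

def coefficients(g, basis):
    """Eq. (2.50): c_i = <g, x_i> / <x_i, x_i> for each basis vector x_i."""
    return [dot(g, x) / dot(x, x) for x in basis]

def combine(coeffs, basis):
    """Form the weighted sum c_1 x_1 + c_2 x_2 + ... component by component."""
    return [sum(c * x[k] for c, x in zip(coeffs, basis))
            for k in range(len(basis[0]))]

# Illustrative mutually orthogonal (not orthonormal) vectors and a test vector.
x1, x2, x3 = (1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 0.0, 2.0)
g = (3.0, 1.0, 4.0)

c_full = coefficients(g, [x1, x2, x3])
g_full = combine(c_full, [x1, x2, x3])          # exact reconstruction

c_two = coefficients(g, [x1, x2])
g_two = combine(c_two, [x1, x2])                # best two-vector approximation
error = [p - q for p, q in zip(g, g_two)]

print("c =", c_full, " reconstruction =", g_full)
print("two-vector approximation =", g_two, " error =", error)
print("<e, x1> =", dot(error, x1), " <e, x2> =", dot(error, x2))
```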


2.6.2 Orthogonal Signal Space 

We continue with our signal approximation problem, using clues and insights developed for
vector approximation. As before, we define orthogonality of a signal set $x_1(t), x_2(t), \ldots, x_N(t)$
over a time domain $\Theta$ (may be an interval $[t_1, t_2]$) as

$$ \int_{t \in \Theta} x_m(t)x_n^*(t)\,dt = \begin{cases} 0 & m \ne n \\ E_n & m = n \end{cases} \tag{2.51} $$

If all signal energies are equal to unity ($E_n = 1$), then the set is normalized and is called an orthonormal
set. An orthogonal set can always be normalized by dividing $x_n(t)$ by $\sqrt{E_n}$ for all $n$. Now, consider
the problem of approximating a signal $g(t)$ over $\Theta$ by a set of $N$ mutually orthogonal
signals $x_1(t), x_2(t), \ldots, x_N(t)$:

$$ g(t) \simeq c_1x_1(t) + c_2x_2(t) + \cdots + c_Nx_N(t) \tag{2.52a} $$

$$ \phantom{g(t)} = \sum_{n=1}^{N} c_nx_n(t) \qquad t \in \Theta \tag{2.52b} $$


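It can be shown that $E_e$, the energy of the error signal $e(t)$ in this approximation, is minimized
if we choose

$$ c_n = \frac{\int_{t \in \Theta} g(t)x_n^*(t)\,dt}{\int_{t \in \Theta} |x_n(t)|^2\,dt} = \frac{1}{E_n}\int_{t \in \Theta} g(t)x_n^*(t)\,dt \qquad n = 1, 2, \ldots, N \tag{2.53} $$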


Moreover, if the orthogonal set is complete, then the error energy $E_e \to 0$, and the representation
in (2.52) is no longer an approximation, but an equality. More precisely, let the $N$-term
approximation error be defined by

$$ e_N(t) = g(t) - [c_1x_1(t) + c_2x_2(t) + \cdots + c_Nx_N(t)] = g(t) - \sum_{n=1}^{N} c_nx_n(t) \qquad t \in \Theta \tag{2.54} $$

If the orthogonal basis is complete, then the error signal energy converges to zero; that is,

$$ \lim_{N \to \infty} \int_{t \in \Theta} |e_N(t)|^2\,dt = 0 \tag{2.55} $$

In a strictly mathematical sense, however, a signal may not converge to zero even though
its energy does. This is because a signal may be nonzero at some isolated points.* Still, for
all practical purposes, signals are continuous for all $t$, and the equality (2.55) states that the
error signal has zero energy as $N \to \infty$. Thus, for $N \to \infty$, the equality (2.52) can be
loosely written as

$$ g(t) = c_1x_1(t) + c_2x_2(t) + \cdots + c_nx_n(t) + \cdots = \sum_{n=1}^{\infty} c_nx_n(t) \qquad t \in \Theta \tag{2.56} $$

where the coefficients $c_n$ are given by Eq. (2.53). Because the error signal energy approaches
zero, it follows that the energy of $g(t)$ is now equal to the sum of the energies of its orthogonal
components.

The series on the right-hand side of Eq. (2.56) is called the generalized Fourier series of
$g(t)$ with respect to the set $\{x_n(t)\}$. When the set $\{x_n(t)\}$ is such that the error energy $E_N \to 0$
as $N \to \infty$ for every member of some particular signal class, we say that the set $\{x_n(t)\}$ is
complete on $\{t : \Theta\}$ for that class of $g(t)$, and the set $\{x_n(t)\}$ is called a set of basis functions
or basis signals. In particular, the class of (finite) energy signals over $\Theta$ is denoted as $L^2\{\Theta\}$.
Unless otherwise mentioned, in the future we shall consider only the class of energy signals.


2.6.3 Parseval’s Theorem 

Recall that the energy of the sum of orthogonal signals is equal to the sum of their energies.
Therefore, the energy of the right-hand side of Eq. (2.56) is the sum of the energies of the
individual orthogonal components. The energy of a component $c_nx_n(t)$ is $c_n^2E_n$. Equating the
energies of the two sides of Eq. (2.56) yields

$$ E_g = c_1^2E_1 + c_2^2E_2 + c_3^2E_3 + \cdots = \sum_{n} c_n^2E_n \tag{2.57} $$

This important result goes by the name of Parseval's theorem. Recall that the signal energy
(area under the squared value of a signal) is analogous to the square of the length of a vector in
the vector-signal analogy. In vector space we know that the square of the length of a vector is
equal to the sum of the squares of the lengths of its orthogonal components. Parseval's theorem
[Eq. (2.57)] is the statement of this fact as applied to signals.
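Parseval's theorem can be watched at work on the square signal of Example 2.2. The sketch below is an illustration under stated assumptions: the square signal is taken as +1 on [0, π) and −1 on [π, 2π], and the basis is the set $\sin nt$, $n = 1, 2, \ldots$, over $[0, 2\pi]$. That set alone is not complete for arbitrary signals on this interval, but it is enough for this particular signal, whose expansion contains only sine terms. The helper names are hypothetical.

```python
import math

def integrate(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def square(t):
    """Square signal assumed from Example 2.2: +1 on [0, pi), -1 on [pi, 2*pi]."""
    return 1.0 if t < math.pi else -1.0

T1, T2 = 0.0, 2 * math.pi
E_g = integrate(lambda t: square(t) ** 2, T1, T2)

def coefficient(n):
    """Return (c_n, E_n) for the basis signal sin(n t), following Eq. (2.53)."""
    basis = lambda t: math.sin(n * t)
    e_n = integrate(lambda t: basis(t) ** 2, T1, T2)
    c_n = integrate(lambda t: square(t) * basis(t), T1, T2) / e_n
    return c_n, e_n

partial = 0.0
partial_sums = []
for n in range(1, 50):
    c_n, e_n = coefficient(n)
    partial += c_n ** 2 * e_n          # energy of the component c_n x_n(t)
    partial_sums.append(partial)

print("E_g =", E_g)
for count in (1, 5, 15, 49):
    print(count, "terms:", partial_sums[count - 1])
```

The partial sums of $c_n^2E_n$ climb toward $E_g = 2\pi$ from below as more components are included, and the remaining gap is the energy of the $N$-term approximation error.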


2.7 THE EXPONENTIAL FOURIER SERIES 

We noted earlier that orthogonal signal representation is NOT unique. While the traditional
trigonometric Fourier series allows a good representation of all periodic signals, here
we provide an orthogonal representation of periodic signals that is equivalent but has a 
simpler form. 


* Known as a measure-zero set.

Figure 2.13 A periodic signal.

First of all, it is clear that the set of exponentials $e^{jn\omega_0 t}$ ($n = 0, \pm 1, \pm 2, \ldots$) is orthogonal
over any interval of duration $T_0 = 2\pi/\omega_0$, that is,

$$ \int_{T_0} e^{jm\omega_0 t}\left(e^{jn\omega_0 t}\right)^* dt = \int_{T_0} e^{j(m-n)\omega_0 t}\,dt = \begin{cases} 0 & m \ne n \\ T_0 & m = n \end{cases} \tag{2.58} $$

Moreover, this set is a complete set [1, 2]. From Eqs. (2.53) and (2.56), it follows that
a signal $g(t)$ can be expressed over an interval of duration $T_0$ second(s) as an
exponential Fourier series

$$ g(t) = \sum_{n=-\infty}^{\infty} D_n e^{jn\omega_0 t} = \sum_{n=-\infty}^{\infty} D_n e^{jn2\pi f_0 t} \tag{2.59} $$

where [see Eq. (2.53)]

$$ D_n = \frac{1}{T_0}\int_{T_0} g(t)e^{-jn2\pi f_0 t}\,dt \tag{2.60} $$

The exponential Fourier series in Eq. (2.59) consists of components of the form $e^{jn2\pi f_0 t}$ with
$n$ varying from $-\infty$ to $\infty$. It is periodic with period $T_0$.

Example 2.3 Find the exponential Fourier series for the signal in Fig. 2.13b.

In this case, $T_0 = \pi$, $2\pi f_0 = 2\pi/T_0 = 2$, and

$$ \varphi(t) = \sum_{n=-\infty}^{\infty} D_n e^{j2nt} $$

where


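$$ \begin{aligned} D_n &= \frac{1}{T_0}\int_{T_0} \varphi(t)e^{-j2nt}\,dt \\ &= \frac{1}{\pi}\int_0^{\pi} e^{-t/2}e^{-j2nt}\,dt \\ &= \frac{1}{\pi}\int_0^{\pi} e^{-\left(\frac{1}{2} + j2n\right)t}\,dt \\ &= \frac{-1}{\pi\left(\frac{1}{2} + j2n\right)}\,e^{-\left(\frac{1}{2} + j2n\right)t}\,\Bigg|_0^{\pi} \\ &= \frac{0.504}{1 + j4n} \end{aligned} \tag{2.61} $$

and

$$ \varphi(t) = 0.504\sum_{n=-\infty}^{\infty} \frac{1}{1 + j4n}\,e^{j2nt} \tag{2.62a} $$

$$ \begin{aligned} \phantom{\varphi(t)} = 0.504\Bigl[\,1 &+ \frac{1}{1 + j4}e^{j2t} + \frac{1}{1 + j8}e^{j4t} + \frac{1}{1 + j12}e^{j6t} + \cdots \\ &+ \frac{1}{1 - j4}e^{-j2t} + \frac{1}{1 - j8}e^{-j4t} + \frac{1}{1 - j12}e^{-j6t} + \cdots \Bigr] \end{aligned} \tag{2.62b} $$

Observe that the coefficients $D_n$ are complex. Moreover, $D_n$ and $D_{-n}$ are conjugates, as
expected.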
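The coefficients of Example 2.3 can be cross-checked by evaluating Eq. (2.60) numerically. The sketch below assumes that one period of the signal is $e^{-t/2}$ on $0 \le t < \pi$, which is what the integrand in Eq. (2.61) indicates; the function names are hypothetical. It compares a midpoint-rule estimate of $D_n$ with the closed form $0.504/(1 + j4n)$ and prints each coefficient in polar form.

```python
import cmath
import math

T0 = math.pi                      # period of the signal in Example 2.3

def phi(t):
    """One period of the signal, assumed to be exp(-t/2) on [0, T0)."""
    return math.exp(-t / 2)

def fourier_coefficient(n, samples=5000):
    """Midpoint-rule estimate of D_n in Eq. (2.60), with 2*pi*f0 = 2."""
    h = T0 / samples
    total = 0j
    for k in range(samples):
        t = (k + 0.5) * h
        total += phi(t) * cmath.exp(-2j * n * t)
    return total * h / T0

def closed_form(n):
    """Closed-form coefficient of Eq. (2.61), using the rounded constant 0.504."""
    return 0.504 / (1 + 4j * n)

for n in (0, 1, -1, 2, -2):
    d_n = fourier_coefficient(n)
    magnitude, angle = cmath.polar(d_n)
    print(n, round(magnitude, 4), round(math.degrees(angle), 2),
          "difference from closed form:", abs(d_n - closed_form(n)))
```

The small residual differences come from rounding the constant $(2/\pi)(1 - e^{-\pi/2})$ to 0.504.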


Exponential Fourier Spectra 

In exponential spectra, we plot coefficients $D_n$ as a function of $\omega$. But since $D_n$ is complex in
general, we need two plots: the real and the imaginary parts of $D_n$, or the amplitude (magnitude)
and the angle of $D_n$. We prefer the latter because of its close connection to the amplitudes and
phases of corresponding components of the trigonometric Fourier series. We therefore plot
$|D_n|$ versus $\omega$ and $\angle D_n$ versus $\omega$. This requires that the coefficients $D_n$ be expressed in polar
form as $|D_n|e^{j\angle D_n}$.

For a real periodic signal, the twin coefficients $D_n$ and $D_{-n}$ are conjugates,

$$ |D_n| = |D_{-n}| \tag{2.63a} $$

$$ \angle D_n = \theta_n \qquad \text{and} \qquad \angle D_{-n} = -\theta_n \tag{2.63b} $$

Thus,

$$ D_n = |D_n|e^{j\theta_n} \qquad \text{and} \qquad D_{-n} = |D_n|e^{-j\theta_n} \tag{2.64} $$

Note that $|D_n|$ are the amplitudes (magnitudes) and $\angle D_n$ are the angles of various exponential
components. From Eq. (2.63) it follows that the amplitude spectrum ($|D_n|$ vs. $f$) is an
even function of $f$ and the angle spectrum ($\angle D_n$ vs. $f$) is an odd function of $f$ when $g(t)$ is a
real signal.

For the series in Example 2.3, for instance,

$$ D_0 = 0.504 $$

$$ D_1 = \frac{0.504}{1 + j4} = 0.122e^{-j75.96^\circ} \implies |D_1| = 0.122, \quad \angle D_1 = -75.96^\circ $$

$$ D_{-1} = \frac{0.504}{1 - j4} = 0.122e^{j75.96^\circ} \implies |D_{-1}| = 0.122, \quad \angle D_{-1} = 75.96^\circ $$

and

$$ D_2 = \frac{0.504}{1 + j8} = 0.0625e^{-j82.87^\circ} \implies |D_2| = 0.0625, \quad \angle D_2 = -82.87^\circ $$

$$ D_{-2} = \frac{0.504}{1 - j8} = 0.0625e^{j82.87^\circ} \implies |D_{-2}| = 0.0625, \quad \angle D_{-2} = 82.87^\circ $$

and so on. Note that $D_n$ and $D_{-n}$ are conjugates, as expected [see Eq. (2.63b)].

Figure 2.14 shows the frequency spectra (amplitude and angle) of the exponential Fourier
series for the periodic signal $\varphi(t)$ in Fig. 2.13b.

We notice some interesting features of these spectra. First, the spectra exist for positive
as well as negative values of $f$ (the frequency). Second, the amplitude spectrum is an even
function of $f$ and the angle spectrum is an odd function of $f$. Equations (2.63) show the
symmetric characteristics of the amplitude and phase of $D_n$.


What Does Negative Frequency Mean? 

The existence of the spectrum at negative frequencies is somewhat disturbing to some people
because by definition, the frequency (number of repetitions per second) is a positive quantity.
How do we interpret a negative frequency $-f_0$? We can use a trigonometric identity to express
a sinusoid of a negative frequency $-f_0$ by borrowing $\omega_0 = 2\pi f_0$, as

$$ \cos(-\omega_0 t + \theta) = \cos(\omega_0 t - \theta) $$


Figure 2.14

This clearly shows that the angular frequency of a sinusoid $\cos(-\omega_0 t + \theta)$ is $|\omega_0|$, which is a
positive quantity. The commonsense statement that a frequency must be positive comes from
the traditional notion that frequency is associated with a real-valued sinusoid (such as a sine or
a cosine). In reality, the concept of frequency for a real-valued sinusoid describes only the rate
of the sinusoidal variation without addressing the direction of the variation. This is because
real-valued sinusoidal signals do NOT contain information on the direction of their variation.

The concept of negative frequency is meaningful only when we are considering complex
sinusoids for which the rate and the direction of variation are meaningful. Observe that

$$ e^{\pm j\omega_0 t} = \cos\omega_0 t \pm j\sin\omega_0 t $$

This relationship clearly shows that either positive or negative $\omega_0$ leads to periodic variation of
the same rate. However, the resulting complex signals are NOT the same. Because $|e^{\pm j\omega_0 t}| = 1$,
both $e^{+j\omega_0 t}$ and $e^{-j\omega_0 t}$ are unit length complex variables that can be shown on the complex
plane. We illustrate the two exponential sinusoids as unit length complex variables that vary
with time $t$ in Fig. 2.15. Thus, the rotation rate for both exponentials $e^{\pm j\omega_0 t}$ is $|\omega_0|$. It is clear that
for positive frequency, the exponential sinusoid rotates counterclockwise while for negative
frequency, the exponential sinusoid rotates clockwise. This illustrates the actual meaning of
negative frequency.

There exists a good analogy between positive/negative frequency and positive/negative 
velocity. Just as people are reluctant to use negative velocity in describing a moving object, 
they are equally unwilling to accept the notion of “negative” frequency. However, once we 
understand that negative velocity simply refers to both the negative direction and the actual 
speed of a moving object, negative velocity makes perfect sense. Likewise, negative frequency 
does NOT describe the rate of periodic variation of a sine or a cosine. It describes the direction
of rotation of a unit length exponential sinusoid and its rate of revolution. 

Another way of looking at the situation is to say that exponential spectra are a graphical
representation of coefficients $D_n$ as a function of $f$. Existence of the spectrum at $f = -nf_0$
merely indicates that an exponential component $e^{-jn2\pi f_0 t}$ exists in the series. We know from
Euler's identity

$$ \cos(\omega t + \theta) = \frac{e^{j\theta}}{2}\exp(j\omega t) + \frac{e^{-j\theta}}{2}\exp(-j\omega t) $$

that a sinusoid of frequency $n\omega_0$ can be expressed in terms of a pair of exponentials $e^{jn\omega_0 t}$
and $e^{-jn\omega_0 t}$. That both sine and cosine consist of positive and negative frequency exponential
sinusoidal components clearly indicates that we are NOT at all able to describe the direction of
their periodic variations. Indeed, both sine and cosine functions of frequency $\omega_0$ consist of two
equal-size exponential sinusoids of frequency $\pm\omega_0$. Thus, the frequency of sine or cosine is
the absolute value of its two component frequencies and denotes only the rate of the sinusoidal
variations.


Figure 2.15

Example 2.4 Find the exponential Fourier series for the periodic square wave $w(t)$ shown in Fig. 2.16.

Figure 2.16 A square pulse periodic signal.

$$ w(t) = \sum_{n=-\infty}^{\infty} D_n e^{jn2\pi f_0 t} $$

where

$$ D_0 = \frac{1}{T_0}\int_{T_0} w(t)\,dt = \frac{1}{2} $$

$$ \begin{aligned} D_n &= \frac{1}{T_0}\int_{T_0} w(t)e^{-jn2\pi f_0 t}\,dt, \qquad n \ne 0 \\ &= \frac{1}{T_0}\int_{-T_0/4}^{T_0/4} e^{-jn2\pi f_0 t}\,dt \\ &= \frac{1}{-jn2\pi f_0 T_0}\left[e^{-jn2\pi f_0 T_0/4} - e^{jn2\pi f_0 T_0/4}\right] \\ &= \frac{2}{n2\pi f_0 T_0}\sin\left(\frac{n2\pi f_0 T_0}{4}\right) = \frac{1}{n\pi}\sin\left(\frac{n\pi}{2}\right) \end{aligned} $$

In this case $D_n$ is real. Consequently, we can do without the phase or angle plot if we plot
$D_n$ vs. $f$ instead of the amplitude spectrum ($|D_n|$ vs. $f$) as shown in Fig. 2.17.
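The closed form of Example 2.4 can be checked in the same way. In the sketch below the period $T_0 = 2$ is an arbitrary choice made for illustration (the result does not depend on it), the pulse is assumed to equal 1 for $|t| < T_0/4$ and 0 elsewhere within the period, as the limits of integration above indicate, and the function names are hypothetical.

```python
import cmath
import math

T0 = 2.0            # arbitrary period chosen for this illustration
F0 = 1.0 / T0       # fundamental frequency f0 = 1/T0

def w(t):
    """One period of the square pulse: 1 for |t| < T0/4, 0 elsewhere."""
    return 1.0 if abs(t) < T0 / 4 else 0.0

def fourier_coefficient(n, samples=8000):
    """Midpoint-rule estimate of D_n over the period [-T0/2, T0/2]."""
    h = T0 / samples
    total = 0j
    for k in range(samples):
        t = -T0 / 2 + (k + 0.5) * h
        total += w(t) * cmath.exp(-2j * math.pi * n * F0 * t)
    return total * h / T0

def closed_form(n):
    """D_0 = 1/2 and D_n = sin(n*pi/2) / (n*pi) for n != 0."""
    return 0.5 if n == 0 else math.sin(n * math.pi / 2) / (n * math.pi)

for n in range(-5, 6):
    d_n = fourier_coefficient(n)
    print(n, round(d_n.real, 6), round(d_n.imag, 6), round(closed_form(n), 6))
```

The imaginary parts are zero to within rounding, and the coefficients vanish for even nonzero $n$, consistent with the real spectrum plotted against $f$.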


Figure 2.17 Exponential Fourier spectrum of the square pulse periodic signal.

Example 2.5 Find the exponential Fourier series and sketch the corresponding spectra for the impulse train
$\delta_{T_0}(t)$ shown in Fig. 2.18a.

The exponential Fourier series is given by

$$ \delta_{T_0}(t) = \sum_{n=-\infty}^{\infty} D_n e^{jn2\pi f_0 t} \qquad f_0 = \frac{1}{T_0} \tag{2.65} $$

where

$$ D_n = \frac{1}{T_0}\int_{T_0} \delta_{T_0}(t)e^{-jn2\pi f_0 t}\,dt $$

Choosing the interval of integration $\left(-\frac{T_0}{2}, \frac{T_0}{2}\right)$ and recognizing that over this interval
$\delta_{T_0}(t) = \delta(t)$, we have

$$ D_n = \frac{1}{T_0}\int_{-T_0/2}^{T_0/2} \delta(t)e^{-jn2\pi f_0 t}\,dt $$

In this integral, the impulse is located at $t = 0$. From the sampling property of the impulse
function, the integral on the right-hand side is the value of $e^{-jn2\pi f_0 t}$ at $t = 0$ (where the
impulse is located). Therefore

$$ D_n = \frac{1}{T_0} \tag{2.66} $$

and

$$ \delta_{T_0}(t) = \frac{1}{T_0}\sum_{n=-\infty}^{\infty} e^{jn2\pi f_0 t} \tag{2.67} $$

Equation (2.67) shows that the exponential spectrum is uniform ($D_n = 1/T_0$) for all the
frequencies, as shown in Fig. 2.18b. The spectrum, being real, requires only the amplitude
plot. All phases are zero.


Figure 2.18

Parseval's Theorem in the Fourier Series

A periodic signal $g(t)$ is a power signal, and every term in its Fourier series is also a power
signal. The power $P_g$ of $g(t)$ is equal to the power of its Fourier series. Because the Fourier
series consists of terms that are mutually orthogonal over one period, the power of the Fourier
series is equal to the sum of the powers of its Fourier components. This follows from Parseval's
theorem.

Thus, for the exponential Fourier series

$$ g(t) = D_0 + \sum_{n=-\infty,\; n \ne 0}^{\infty} D_n e^{jn\omega_0 t} $$

the power is given by (see Prob. 2.1-7)

$$ P_g = \sum_{n=-\infty}^{\infty} |D_n|^2 \tag{2.68a} $$

For a real $g(t)$, $|D_{-n}| = |D_n|$. Therefore

$$ P_g = D_0^2 + 2\sum_{n=1}^{\infty} |D_n|^2 \tag{2.68b} $$

Comment: Parseval's theorem occurs in many different forms, such as in Eqs. (2.57) and
(2.68a). Yet another form is found in the next chapter for nonperiodic signals. Although
these forms appear to be different, they all state the same principle: that is, the square of the
length of a vector equals the sum of the squares of its orthogonal components. The first form
[Eq. (2.57)] applies to energy signals, and the second [Eq. (2.68a)] applies to periodic signals
represented by the exponential Fourier series.
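Equation (2.68b) can be tested on the signal of Example 2.3. The sketch below again assumes that one period of that signal is $e^{-t/2}$ on $[0, \pi)$, so that its power is the average of $e^{-t}$ over one period, and it uses the unrounded constant $(2/\pi)(1 - e^{-\pi/2}) \approx 0.504$ in the coefficients $D_n = 0.504/(1 + j4n)$. The function names are hypothetical.

```python
import math

# Unrounded value of the constant that appears as 0.504 in Eq. (2.61).
K = (2 / math.pi) * (1 - math.exp(-math.pi / 2))

def d_magnitude_squared(n):
    """|D_n|^2 for D_n = K / (1 + j4n)."""
    return K ** 2 / (1 + 16 * n ** 2)

def power_from_coefficients(n_max):
    """Eq. (2.68b), truncated after n_max harmonics."""
    return d_magnitude_squared(0) + 2 * sum(
        d_magnitude_squared(n) for n in range(1, n_max + 1))

def power_time_domain(samples=20000):
    """Average of |exp(-t/2)|^2 = exp(-t) over one period [0, pi) (midpoint rule)."""
    h = math.pi / samples
    return sum(math.exp(-(k + 0.5) * h) for k in range(samples)) * h / math.pi

print("power from the time domain:", power_time_domain())
for n_max in (1, 10, 100, 5000):
    print("power from", n_max, "harmonics:", power_from_coefficients(n_max))
```

The truncated sums approach the time-domain power from below, since every neglected term $|D_n|^2$ is nonnegative.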

Some Other Examples of Orthogonal Signal Sets 

The signal representation by Fourier series shows that signals are vectors in every sense. Just 
as a vector can be represented as a sum of its components in a variety of ways, depending upon 
the choice of a coordinate system, a signal can be represented as a sum of its components in 
a variety of ways. Just as we have vector coordinate systems formed by mutually orthogonal 
vectors (rectangular, cylindrical, spherical, etc.), we also have signal coordinate systems, basis 
signals, formed by a variety of sets of mutually orthogonal signals. There exist a large number
of orthogonal signal sets that can be used as basis signals for generalized Fourier series. Some 
well-known signal sets are trigonometric (sinusoid) functions, exponential functions, Walsh 
functions, Bessel functions, Legendre polynomials, Laguerre functions, Jacobi polynomials, 
Hermite polynomials, and Chebyshev polynomials. The functions that concern us most in this 
book are the exponential sets discussed next in the chapter. 


2.8 MATLAB EXERCISES 


In this section, we provide some basic MATLAB exercises to illustrate the process of signal 
generation, signal operations, and Fourier series analysis. 
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Basic Signals and Signal Graphing 

Basic functions can be defined by using MATLAB's m-files. We give three MATLAB programs
that implement three basic functions when a time vector t is provided:

* ustep.m implements the unit step function u(t)

* rect.m implements the standard rectangular function rect(t)

* triangl.m implements the standard triangle function Δ(t)


% (file name: ustep.m)
% The unit step function is a function of time 't'.
% Usage y = ustep(t)
%
% ustep(t) = 0, if t < 0
% ustep(t) = 1, if t >= 0
%
% t - must be real-valued and can be a vector or a matrix
%
function y=ustep(t)
    y = (t>=0);
end


% (file name: rect.m)
% The rectangular function is a function of time 't'.
%
% Usage y = rect(t)
% t - must be real-valued and can be a vector or a matrix
%
% rect(t) = 1, if |t| < 0.5
% rect(t) = 0, if |t| > 0.5
%
function y=rect(t)
    y =(sign(t+0.5)-sign(t-0.5) >0);
end


% (file name: triangl.m)
% The triangle function is a function of time 't'.
%
% triangl(t) = 1-|t|, if |t| < 1
% triangl(t) = 0, if |t| > 1
%
% Usage y = triangl(t)
% t - must be real-valued and can be a vector or a matrix
%
function y=triangl(t)
    y = (1-abs(t)).*(t>=-1).*(t<1);
end

Figure 2.19


We now show how to use MATLAB to generate a simple signal plot through an example.
The program siggraf.m is provided. In this example, we construct and plot a signal

y(t) = exp(−t) sin(6πt) u(t + 1)

The resulting graph is shown in Fig. 2.19.

% (file name: siggraf.m)
% To graph a signal, the first step is to determine
% the x-axis and the y-axis to plot
% We can first decide the length of x-axis to plot
t=[-2:0.01:3];      % "t" is from -2 to 3 in 0.01 increment
% Then evaluate the signal over the range of "t" to plot
y=exp(-t).*sin(10*pi*t).*ustep(t+1);
figure(1); fig1=plot(t,y);          % plot t vs y in figure 1
set(fig1,'Linewidth',2);            % choose a wider line-width
xlabel('\it t');                    % use italic 't' to label x-axis
ylabel('{\bf y}({\it t })');        % use boldface 'y' to label y-axis
title('{\bf y}_{\rm time domain}'); % can use subscript


Periodic Signals and Signal Power 

Periodic signals can be generated by first determining the signal values in one period before 
repeating the same signal vector multiple times. 

In the following MATLAB program PfuncEx.m, we generate a periodic signal and
observe its behavior over 2M periods. The period of this example is T = 6. The program also
evaluates the average signal power, which is stored in the variable y_power, and the signal energy
in one period, which is stored in the variable y_energyT.

% (file name: PfuncEx.m)

% This example generates a periodic signal, plots the signal
% and evaluates the average signal power in y_power and signal
% energy in 1 period T: y_energyT
echo off;clear;clf;
% To generate a periodic signal g_T(t),
% we can first decide the signal within the period of 'T' for g(t)
Dt=0.002; % Time interval (to sample the signal)
T=6; % period=T
M=3; % To generate 2M periods of the signal
t=[0:Dt:T-Dt]; %"t" goes for one period [0, T] in Dt increment
% Then evaluate the signal over the range of "T"
y=exp(-abs(t)/2).*sin(2*pi*t).*(ustep(t)-ustep(t-4));
% Multiple periods can now be generated.
time=[];
y_periodic=[];
for i=-M:M-1,
    time=[time i*T+t];
    y_periodic=[y_periodic y];
end
figure(1); fy=plot(time,y_periodic);
set(fy,'Linewidth',2);xlabel('{\it t}');
echo on
% Compute average power
y_power=sum(y_periodic*y_periodic')*Dt/(max(time)-min(time))
% Compute signal energy in 1 period T
y_energyT=sum(y.*conj(y))*Dt


The program generates a periodic signal as shown in Fig. 2.20 and numerical answers: 

y_power = 

0.0813

y_energyT = 

0.4878 
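The same two numbers can be cross-checked outside MATLAB. The following is a minimal Python sketch, assuming the same sampling interval and the same one-period signal as PfuncEx.m; the variable names simply mirror the MATLAB listing, and the power is computed here as energy per period divided by T rather than from the concatenated vector.

```python
import math


def ustep(t):
    """Unit step, 1 for t >= 0 and 0 otherwise."""
    return 1.0 if t >= 0 else 0.0


Dt = 0.002                     # sampling interval, as in PfuncEx.m
T = 6.0                        # period
n = round(T / Dt)              # number of samples in one period
t = [k * Dt for k in range(n)]

# One period of the signal: exp(-|t|/2) sin(2 pi t) [u(t) - u(t - 4)]
y = [math.exp(-abs(tk) / 2) * math.sin(2 * math.pi * tk)
     * (ustep(tk) - ustep(tk - 4)) for tk in t]

y_energyT = sum(v * v for v in y) * Dt   # Riemann-sum energy over one period
y_power = y_energyT / T                  # average power of the periodic signal

print(y_energyT, y_power)
```

Because the signal is periodic, the average power over any whole number of periods equals the one-period energy divided by T, which is why the two printed values differ by a factor of 6.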


Signal Correlation 

MATLAB can directly implement the concept of signal correlation introduced
in Section 2.5. In the next computer example, we provide a program, sign_cor.m, that
evaluates the signal correlation coefficients between $x(t)$ and the signals $g_1(t), g_2(t), \ldots, g_5(t)$.
The program first generates Fig. 2.21, which illustrates the six signals in the time domain.


% (file name: sign_cor.m) 
clear 

% To generate 6 signals x(t), g_1(t), ... g_5(t);
% of this Example
% we can first decide the signal within the period of 'T' for g(t)
subplot(231); sig1=plot(t,x,'k');
xlabel('\it t'); ylabel('{\it x}({\it t})'); % Label axis
set(sig1,'Linewidth',2); % change linewidth
axis([-.5 6 -1.2 1.2]); grid % set plot range
subplot(232); sig2=plot(t,g1,'k');
xlabel('\it t'); ylabel('{\it g}_1({\it t})');
set(sig2,'Linewidth',2);
axis([-.5 6 -1.2 1.2]); grid
subplot(233); sig3=plot(t,g2,'k');
xlabel('\it t'); ylabel('{\it g}_2({\it t})');
set(sig3,'Linewidth',2);
axis([-.5 6 -1.2 1.2]); grid
subplot(234); sig4=plot(t,g3,'k');
xlabel('\it t'); ylabel('{\it g}_3({\it t})');
set(sig4,'Linewidth',2);
axis([-.5 6 -1.2 1.2]); grid
subplot(235); sig5=plot(t,g4,'k');
xlabel('\it t'); ylabel('{\it g}_4({\it t})');
set(sig5,'Linewidth',2);grid
axis([-.5 6 -1.2 1.2]);
subplot(236); sig6=plot(t,g5,'k');
xlabel('\it t'); ylabel('{\it g}_5({\it t})');
set(sig6,'Linewidth',2);grid
axis([-.5 6 -1.2 1.2]);
% Computing signal energies
E0=sum(x.*conj(x))*Dt;
E1=sum(g1.*conj(g1))*Dt;
E2=sum(g2.*conj(g2))*Dt;
E3=sum(g3.*conj(g3))*Dt;
E4=sum(g4.*conj(g4))*Dt;
E5=sum(g5.*conj(g5))*Dt;
% Computing correlation coefficients
c0=sum(x.*conj(x))*Dt/(sqrt(E0*E0))
c1=sum(x.*conj(g1))*Dt/(sqrt(E0*E1))
c2=sum(x.*conj(g2))*Dt/(sqrt(E0*E2))
c3=sum(x.*conj(g3))*Dt/(sqrt(E0*E3))
c4=sum(x.*conj(g4))*Dt/(sqrt(E0*E4))
c5=sum(x.*conj(g5))*Dt/(sqrt(E0*E5))

The six correlation coefficients are obtained from the program as 

c0 =
     1

c1 =
     1

c2 =
    -1

c3 =
    0.9614

c4 =
    0.6282

c5 =
    8.6748e-17
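The correlation computation itself is short enough to sketch on its own. The Python fragment below assumes real-valued signals and uses three hypothetical test signals (a scaled copy, a negated copy, and a quadrature sinusoid), not the $g_1(t), \ldots, g_5(t)$ of the example; the helper name `corr_coef` is likewise an illustrative choice.

```python
import math


def corr_coef(x, g, dt):
    """Correlation coefficient of two real sampled signals:
    (sum of x*g*dt) / sqrt(Ex * Eg), as in the last lines of sign_cor.m."""
    ex = sum(a * a for a in x) * dt
    eg = sum(b * b for b in g) * dt
    return sum(a * b for a, b in zip(x, g)) * dt / math.sqrt(ex * eg)


dt = 0.001
t = [k * dt for k in range(1000)]                    # one second of samples
x = [math.sin(2 * math.pi * tk) for tk in t]         # reference signal

g_scaled = [0.5 * v for v in x]                      # same shape, half amplitude
g_negated = [-v for v in x]                          # sign-inverted copy
g_quad = [math.cos(2 * math.pi * tk) for tk in t]    # orthogonal over one period

print(corr_coef(x, g_scaled, dt))    # close to +1
print(corr_coef(x, g_negated, dt))   # close to -1
print(corr_coef(x, g_quad, dt))      # close to 0
```

The three cases reproduce the pattern in the program output above: amplitude scaling does not change the coefficient, sign inversion gives -1, and orthogonal signals give a value that is zero up to rounding error.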


Numerical Computation of Coefficients $D_n$

There are several ways to numerically compute the Fourier series coefficients $D_n$. We will use
MATLAB to show how to use numerical integration in the evaluation of Fourier series.

To carry out a direct numerical integration of Eq. (2.60), the first step is to define the
symbolic expression of the signal $g(t)$ under analysis. We use the triangle function $\Delta(t)$ in the
following example.


% (funct_tri.m) 

% A standard triangle function of base -1 to 1 
function y = funct_tri(t) 

% Usage y = funct_tri(t)
% t - input variable
y=((t>-1)-(t>1)).*(1-abs(t));


Once the file funct_tri.m defines the function $y = g(t)$, we can directly carry
out the necessary integration of Eq. (2.60) for a finite number of Fourier series coefficients
$\{D_n,\ n = -N, \ldots, -1, 0, 1, \ldots, N\}$. We provide the following MATLAB program called
FSexp_a.m to evaluate the Fourier series of $\Delta(t/2)$ with period $[a, b]$ ($a = -2$, $b = 2$).
In this example, $N = 11$ is selected. Executing this short program in MATLAB will generate
Fig. 2.22 with both amplitude and angle of $D_n$.


% (file name: FSexp_a.m) 

% This example shows how to numerically evaluate 
% the exponential Fourier series coefficients Dn 
% directly. 

% The user needs to define a symbolic function 
% g(t). In this example, g(t)=funct_tri(t).
echo off; clear; clf;
j=sqrt(-1); % Define j for complex algebra
b=2; a=-2; % Determine one signal period
tol=1.e-5; % Set integration error tolerance
T=b-a; % length of the period
N=11; % Number of FS coefficients
% on each side of zero frequency
Fi=[-N:N]*2*pi/T; % Set frequency range
% now calculate D_0 and store it in D(N+1);
Func= @(t) funct_tri(t/2);
D(N+1)=1/T*quad(Func,a,b,tol); % Using quad.m integration
for i=1:N
    % Calculate Dn for n=1,...,N (stored in D(N+2) ... D(2N+1))
    Func= @(t) exp(-j*2*pi*t*i/T).*funct_tri(t/2);
    D(i+N+1)=1/T*quad(Func,a,b,tol); % 1/T factor as in D_0
    % Calculate Dn for n=-N,...,-1 (stored in D(1) ... D(N))
    Func= @(t) exp(j*2*pi*t*(N+1-i)/T).*funct_tri(t/2);
    D(i)=1/T*quad(Func,a,b,tol); % 1/T factor as in D_0
end
figure(1);
subplot(211);s1=stem([-N:N],abs(D));
set(s1,'Linewidth',2); ylabel('|{\it D}_{\it n}|');
title('Amplitude of {\it D}_{\it n}')
subplot(212);s2=stem([-N:N],angle(D));
set(s2,'Linewidth',2); ylabel('<{\it D}_{\it n}');
title('Angle of {\it D}_{\it n}');
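As a cross-check of the numerical integration, the coefficients of the same triangle signal can be estimated with a plain midpoint rule. The Python sketch below assumes the definition $D_n = \frac{1}{T}\int_a^b g(t)e^{-j2\pi nt/T}\,dt$; the helper name `fs_coeff` and the step count are illustrative choices, not part of the MATLAB program.

```python
import cmath
import math


def funct_tri(t):
    """Triangle of base -1 to 1, as in funct_tri.m."""
    return max(0.0, 1.0 - abs(t))


def fs_coeff(g, a, b, n, steps=4000):
    """Midpoint-rule estimate of D_n = (1/T) * integral over [a, b]
    of g(t) exp(-j 2 pi n t / T) dt, with T = b - a."""
    T = b - a
    h = T / steps
    total = 0j
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += g(t) * cmath.exp(-2j * math.pi * n * t / T)
    return total * h / T


def g(t):
    """The signal analyzed in the example: the triangle stretched by 2."""
    return funct_tri(t / 2)


D = {n: fs_coeff(g, -2.0, 2.0, n) for n in range(-3, 4)}
for n in sorted(D):
    print(n, abs(D[n]), cmath.phase(D[n]))
```

For this even, real signal the coefficients come out real and symmetric in n; the closed form is $D_n = \tfrac{1}{2}\left[\sin(n\pi/2)/(n\pi/2)\right]^2$, so the even-numbered coefficients other than $D_0$ vanish.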
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PROBLEMS 


2.1-1 Find the energies of the signals shown in Fig. P2.1-1. Comment on the effect on energy of sign
change, time shift, or doubling of the signal. What is the effect on the energy if the signal is
multiplied by k?




2.1-2 (a) Find $E_x$ and $E_y$, the energies of the signals $x(t)$ and $y(t)$ shown in Fig. P2.1-2a. Sketch the
signals $x(t) + y(t)$ and $x(t) - y(t)$ and show that the energy of either of these two signals is
equal to $E_x + E_y$. Repeat the procedure for the signal pair in Fig. P2.1-2b.

(b) Now repeat the procedure for the signal pair in Fig. P2.1-2c. Are the energies of the signals
$x(t) + y(t)$ and $x(t) - y(t)$ identical in this case?

2.1-3 Find the power of a sinusoid $C\cos(\omega_0 t + \theta)$.

2.1-4 Show that if $\omega_1 = \omega_2$, the power of $g(t) = C_1\cos(\omega_1 t + \theta_1) + C_2\cos(\omega_2 t + \theta_2)$ is
$[C_1^2 + C_2^2 + 2C_1C_2\cos(\theta_1 - \theta_2)]/2$, which is not equal to $(C_1^2 + C_2^2)/2$.

2.1-5 Find the power of the periodic signal $g(t)$ shown in Fig. P2.1-5. Find also the powers and the rms
values of (a) $-g(t)$ (b) $2g(t)$ (c) $cg(t)$. Comment.

2.1-6 Find the power and the rms value for the signals in (a) Fig. P2.1-6a; (b) Fig. 2.16; (c) Fig. P2.1-6b;
(d) Fig. P2.7-4a; (e) Fig. P2.7-4c.
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Figure P.2.3-1 


Figure P.2.3-2 


2.1-8 Determine the power and the rms value for each of the following signals:

(a) $10\cos(100t + \pi/3)$
(b) $10\cos(100t + \pi/3) + 16\sin(150t + \pi/5)$
(c) $(10 + 2\sin 3t)\cos 10t$
(d) $10\cos 5t\,\cos 10t$
(e) $10\sin 5t\,\cos 10t$
(f) $e^{j\alpha t}\cos\omega_0 t$

2.2-1 Show that an exponential $e^{-at}$ starting at $-\infty$ is neither an energy nor a power signal for any
real value of $a$. However, if $a$ is imaginary, it is a power signal with power $P_g = 1$ regardless of
the value of $a$.

2.3-1 In Fig. P2.3-1, the signal $g_1(t) = g(-t)$. Express signals $g_2(t)$, $g_3(t)$, $g_4(t)$, and $g_5(t)$ in terms
of signals $g(t)$, $g_1(t)$, and their time-shifted, time-scaled, or time-inverted versions. For instance,
$g_2(t) = g(t - T) + g_1(t - T)$ for some suitable value of $T$. Similarly, both $g_3(t)$ and $g_4(t)$ can be
expressed as $g(t - T) + g(t - T)$ for some suitable value of $T$. In addition, $g_5(t)$ can be expressed
as $g(t)$ time-shifted, time-scaled, and then multiplied by a constant.



2.3-2 For the signal $g(t)$ shown in Fig. P2.3-2, sketch the following signals: (a) $g(-t)$; (b) $g(t + 6)$;
(c) $g(3t)$; (d) $g(6 - t)$.



2.3-3 For the signal $g(t)$ shown in Fig. P2.3-3, sketch (a) $g(t - 4)$; (b) $g(t/1.5)$; (c) $g(2t - 4)$; (d)
$g(2 - t)$.

Hint: Recall that replacing $t$ with $t - T$ delays the signal by $T$. Thus, $g(2t - 4)$ is $g(2t)$ with $t$
replaced by $t - 2$. Similarly, $g(2 - t)$ is $g(-t)$ with $t$ replaced by $t - 2$.

2.3-4 For an energy signal $g(t)$ with energy $E_g$, show that the energy of any one of the signals
$-g(t)$, $g(-t)$, and $g(t - T)$ is $E_g$. Show also that the energy of $g(at)$ as well as $g(at - b)$
is $E_g/a$. This shows that time inversion and time shifting do not affect signal energy. On the other
hand, time compression of a signal by a factor $a$ reduces the energy by the factor $a$. What is the
effect on signal energy if the signal is (a) time-expanded by a factor $a$ ($a > 1$) and (b) multiplied
by a constant $a$?

Figure P.2.3-3


2.3-5 Simplify the following expressions: 


(a) 

G“7,h> 

(d) 

(b) 

Cr + >> 

(e) 

(0 

$[e^{-t}\cos(3t - \pi/3)]\,\delta(t + \pi)$

(f) 


( sin ko A „ 

—j 4< "> 


Hint: Use Eq. (2.10b). For part (f) use L'Hôpital's rule.


2.3-6 Evaluate the following integrals: 


(a) - r)dr 

(b) - r)dr 

(c) me'*# dt 

(d) $U - 2) sin Jitdt 


(e) f^&Q + Oe-'dt 

(f) /3 2 0 3 + 4)5(1 -Odt 

(g) /^£(2-0<S(3-r><* 

(h) e^- ' > cos § (x - 5)3 (2x - 3) dx 


Hint: $\delta(x)$ is located at $x = 0$. For example, $\delta(1 - t)$ is located at $1 - t = 0$; that is, at $t = 1$, and
so on.


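2.3-7 Prove that

$$\delta(at) = \frac{1}{|a|}\,\delta(t)$$

Hence show that

$$\delta(\omega) = \frac{1}{2\pi}\,\delta(f) \qquad \text{where } \omega = 2\pi f$$

Hint: Show that

$$\int_{-\infty}^{\infty} \phi(t)\,\delta(at)\,dt = \frac{1}{|a|}\,\phi(0)$$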


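2.4-1 Derive Eq. (2.19) in an alternate way by observing that $\mathbf{e} = (\mathbf{g} - c\mathbf{x})$, and

$$|\mathbf{e}|^2 = (\mathbf{g} - c\mathbf{x})\cdot(\mathbf{g} - c\mathbf{x}) = |\mathbf{g}|^2 + c^2|\mathbf{x}|^2 - 2c\,\mathbf{g}\cdot\mathbf{x}$$

To minimize $|\mathbf{e}|^2$, equate its derivative with respect to $c$ to zero.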


2.4-2 For the signals $g(t)$ and $x(t)$ shown in Fig. P2.4-2, find the component of the form $x(t)$ contained
in $g(t)$. In other words, find the optimum value of $c$ in the approximation $g(t) \approx cx(t)$ so that the
error signal energy is minimum. What is the resulting error signal energy?

2.4-3 For the signals $g(t)$ and $x(t)$ shown in Fig. P2.4-2, find the component of the form $g(t)$ contained
in $x(t)$. In other words, find the optimum value of $c$ in the approximation $x(t) \approx cg(t)$ so that the
error signal energy is minimum. What is the resulting error signal energy?

2.4-4 Repeat Prob. 2.4-2 if $x(t)$ is a sinusoid pulse shown in Fig. P2.4-4.


Figure P.2.4-4 (sinusoid pulse $x(t) = \sin 2\pi t$)


2.4-5 The energies of the two energy signals $x(t)$ and $y(t)$ are $E_x$ and $E_y$, respectively.

(a) If $x(t)$ and $y(t)$ are orthogonal, then show that the energy of the signal $x(t) + y(t)$ is identical
to the energy of the signal $x(t) - y(t)$, and is given by $E_x + E_y$.

(b) If $x(t)$ and $y(t)$ are orthogonal, find the energies of signals $c_1x(t) + c_2y(t)$ and $c_1x(t) - c_2y(t)$.

(c) We define $E_{xy}$, the cross-energy of the two energy signals $x(t)$ and $y(t)$, as

$$E_{xy} = \int_{-\infty}^{\infty} x(t)\,y^*(t)\,dt$$

If $z(t) = x(t) \pm y(t)$, then show that

$$E_z = E_x + E_y \pm (E_{xy} + E_{yx})$$

2.4-6 Let $x_1(t)$ and $x_2(t)$ be two unit energy signals orthogonal over an interval from $t = t_1$ to $t_2$.
Signals $x_1(t)$ and $x_2(t)$ are unit energy, orthogonal signals; we can represent them by two unit
length, orthogonal vectors $(\mathbf{x}_1, \mathbf{x}_2)$. Consider a signal $g(t)$ where

$$g(t) = c_1x_1(t) + c_2x_2(t) \qquad t_1 \le t \le t_2$$

This signal can be represented as a vector $\mathbf{g}$ by a point $(c_1, c_2)$ in the $x_1$-$x_2$ plane.

(a) Determine the vector representation of the following six signals in this two-dimensional
vector space:

(i) $g_1(t) = 2x_1(t) - x_2(t)$
(ii) $g_2(t) = -x_1(t) + 2x_2(t)$
(iii) $g_3(t) = -x_2(t)$
(iv) $g_4(t) = x_1(t) + 2x_2(t)$
(v) $g_5(t) = 2x_1(t) + x_2(t)$
(vi) $g_6(t) = 3x_1(t)$

(b) Point out pairs of mutually orthogonal vectors among these six vectors. Verify that the pairs
of signals corresponding to these orthogonal vectors are also orthogonal.
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Figure P.2.5-1 


2.5-1 Find the correlation coefficient $c_n$ of signal $x(t)$ and each of the four pulses $g_1(t)$, $g_2(t)$, $g_3(t)$, and
$g_4(t)$ shown in Fig. P2.5-1. To provide maximum margin against the noise along the transmission
path, which pair of pulses would you select for a binary communication?








2.7-1 (a) Sketch the signal $g(t) = t^2$ and find the exponential Fourier series to represent $g(t)$ over the
interval $(-1, 1)$. Sketch the Fourier series $\varphi(t)$ for all values of $t$.

(b) Verify Parseval's theorem [Eq. (2.68a)] for this case, given that

$$\sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}$$

2.7-2 (a) Sketch the signal $g(t) = t$ and find the exponential Fourier series to represent $g(t)$ over the
interval $(-\pi, \pi)$. Sketch the Fourier series $\varphi(t)$ for all values of $t$.

(b) Verify Parseval's theorem [Eq. (2.68a)] for this case, given that

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$$

2.7-3 If a periodic signal satisfies certain symmetry conditions, the evaluation of the Fourier series 
coefficients is somewhat simplified. 

(a) Show that if $g(t) = g(-t)$ (even symmetry), then the coefficients of the exponential Fourier
series are real.

(b) Show that if $g(t) = -g(-t)$ (odd symmetry), the coefficients of the exponential Fourier
series are imaginary.

(c) Show that in each case, the Fourier coefficients can be evaluated by integrating the periodic
signal over the half-cycle only. This is because the entire information of one cycle is implicit
in a half-cycle owing to symmetry.

Hint: If $g_e(t)$ and $g_o(t)$ are even and odd functions, respectively, of $t$, then (assuming no impulse
or its derivative at the origin),

$$\int_{-a}^{a} g_e(t)\,dt = 2\int_{0}^{a} g_e(t)\,dt \qquad \text{and} \qquad \int_{-a}^{a} g_o(t)\,dt = 0$$

Also, the product of an even and an odd function is an odd function, the product of two odd
functions is an even function, and the product of two even functions is an even function.

2.7-4 For each of the periodic signals shown in Fig. P2.7-4, find the exponential Fourier series and
sketch the amplitude and phase spectra. Note any symmetric property.










(b) Determine the odd and even components of the following functions: (i) $u(t)$; (ii) $e^{-at}u(t)$;
(iii) $e^{jt}$.

2.7-6 (a) If the two halves of one period of a periodic signal are of identical shape except that one is the
negative of the other, the periodic signal is said to have a half-wave symmetry. If a periodic
signal $g(t)$ with a period $T_0$ satisfies the half-wave symmetry condition, then

$$g\left(t - \frac{T_0}{2}\right) = -g(t)$$

In this case, show that all the even-numbered harmonics (coefficients) vanish.

(b) Use this result to find the Fourier series for the periodic signals in Fig. P2.7-6.



2.8-1 A periodic signal $g(t)$ is expressed by the following Fourier series:

g(t) — 3sin t + cos 4- 2cos (8f 4- j) 


(a) By applying Euler's identities on the signal $g(t)$ directly, write the exponential Fourier series
for $g(t)$.

(b) By applying Euler's identities on the signal $g(t)$ directly, sketch the exponential Fourier series
spectra.



3 ANALYSIS AND TRANSMISSION OF SIGNALS


Electrical engineers instinctively think of signals in terms of their frequency spectra
and think of systems in terms of their frequency responses. Even teenagers know about
audio signals having a bandwidth of 20 kHz and good-quality loudspeakers responding
up to 20 kHz. This is basically thinking in the frequency domain. In the last chapter we
discussed spectral representation of periodic signals (Fourier series). In this chapter we extend
this spectral representation to aperiodic signals.


3.1 APERIODIC SIGNAL REPRESENTATION BY 
FOURIER INTEGRAL 


Applying a limiting process, we now show that an aperiodic signal can be expressed as a
continuous sum (integral) of everlasting exponentials. To represent an aperiodic signal $g(t)$
such as the one shown in Fig. 3.1a by everlasting exponential signals, let us construct a
new periodic signal $g_{T_0}(t)$ formed by repeating the signal $g(t)$ every $T_0$ seconds, as shown in
Fig. 3.1b. The period $T_0$ is made long enough to avoid overlap between the repeating pulses. The
periodic signal $g_{T_0}(t)$ can be represented by an exponential Fourier series. If we let $T_0 \to \infty$,
the pulses in the periodic signal repeat after an infinite interval, and therefore

$$\lim_{T_0 \to \infty} g_{T_0}(t) = g(t)$$

Thus, the Fourier series representing $g_{T_0}(t)$ will also represent $g(t)$ in the limit $T_0 \to \infty$.
The exponential Fourier series for $g_{T_0}(t)$ is given by

$$g_{T_0}(t) = \sum_{n=-\infty}^{\infty} D_n e^{jn\omega_0 t} \tag{3.1}$$

in which

$$D_n = \frac{1}{T_0}\int_{-T_0/2}^{T_0/2} g_{T_0}(t)\,e^{-jn\omega_0 t}\,dt \tag{3.2a}$$




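and

$$\omega_0 = \frac{2\pi}{T_0} = 2\pi f_0 \tag{3.2b}$$

Observe that integrating $g_{T_0}(t)$ over $(-T_0/2, T_0/2)$ is the same as integrating $g(t)$ over
$(-\infty, \infty)$. Therefore, Eq. (3.2a) can be expressed as

$$D_n = \frac{1}{T_0}\int_{-\infty}^{\infty} g(t)\,e^{-jn\omega_0 t}\,dt = \frac{1}{T_0}\int_{-\infty}^{\infty} g(t)\,e^{-j2\pi nf_0 t}\,dt \tag{3.2c}$$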


It is interesting to see how the nature of the spectrum changes as $T_0$ increases. To understand
this fascinating behavior, let us define $G(f)$, a continuous function of $f$, as

$$G(f) = \int_{-\infty}^{\infty} g(t)\,e^{-j\omega t}\,dt \tag{3.3}$$

$$\phantom{G(f)} = \int_{-\infty}^{\infty} g(t)\,e^{-j2\pi ft}\,dt \tag{3.4}$$

in which $\omega = 2\pi f$. A glance at Eqs. (3.2c) and (3.3) shows that

$$D_n = \frac{1}{T_0}\,G(nf_0) \tag{3.5}$$

This in turn shows that the Fourier coefficients $D_n$ are ($1/T_0$ times) the samples of $G(f)$
uniformly spaced at intervals of $f_0$ Hz, as shown in Fig. 3.2a.*

Therefore, $(1/T_0)G(f)$ is the envelope for the coefficients $D_n$. We now let $T_0 \to \infty$
by doubling $T_0$ repeatedly. Doubling $T_0$ halves the fundamental frequency $f_0$, so that there
are now twice as many components (samples) in the spectrum. However, by doubling $T_0$, the
envelope $(1/T_0)G(f)$ is halved, as shown in Fig. 3.2b. If we continue this process of doubling $T_0$
repeatedly, the spectrum progressively becomes denser while its magnitude becomes smaller.
Note, however, that the relative shape of the envelope remains the same [proportional to $G(f)$
in Eq. (3.3)]. In the limit as $T_0 \to \infty$, $f_0 \to 0$ and $D_n \to 0$. This means that the spectrum is
so dense that the spectral components are spaced at zero (infinitesimal) interval. At the same
time, the amplitude of each component is zero (infinitesimal). We have nothing of everything,
yet we have something! This sounds like Alice in Wonderland, but as we shall see, these are
the classic characteristics of a very familiar phenomenon.†

* For the sake of simplicity we assume $D_n$ and therefore $G(f)$ in Fig. 3.2 to be real. The argument, however, is also
valid for complex $D_n$ [or $G(f)$].
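The sampling relation of Eq. (3.5) and the halving of the envelope can be checked numerically. The Python sketch below uses a hypothetical test signal (a unit-height pulse on |t| < 0.5) and midpoint-rule integration; the helper names `pulse`, `G`, and `D` are illustrative and do not come from the text.

```python
import cmath
import math


def pulse(t):
    """Hypothetical aperiodic test signal: unit height for |t| < 0.5."""
    return 1.0 if abs(t) < 0.5 else 0.0


def G(f, steps=2000):
    """Midpoint-rule estimate of the Fourier integral of pulse(t),
    taken over its support (-0.5, 0.5)."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        t = -0.5 + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * f * t)
    return total * h


def D(n, T0, steps=2000):
    """Fourier series coefficient D_n of the pulse repeated every T0 seconds."""
    h = T0 / steps
    total = 0j
    for k in range(steps):
        t = -T0 / 2 + (k + 0.5) * h
        total += pulse(t) * cmath.exp(-2j * math.pi * n * t / T0)
    return total * h / T0


# Same frequency (0.25 Hz) seen with T0 = 4 (n = 1) and T0 = 8 (n = 2).
print(D(1, 4.0), G(0.25) / 4.0)
print(D(2, 8.0), G(0.25) / 8.0)
```

Both lines print matching pairs, and the second coefficient is half the first: doubling $T_0$ places twice as many samples on the same envelope shape while halving its height.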

Substitution of Eq. (3.5) in Eq. (3.1) yields

$$g_{T_0}(t) = \sum_{n=-\infty}^{\infty} \frac{G(nf_0)}{T_0}\,e^{j2\pi nf_0 t} \tag{3.6}$$

As $T_0 \to \infty$, $f_0 = 1/T_0$ becomes infinitesimal ($f_0 \to 0$). Because of this, we shall replace $f_0$
by a more appropriate notation, $\Delta f$. In terms of this new notation, Eq. (3.2b) becomes

$$\Delta f = \frac{1}{T_0}$$

and Eq. (3.6) becomes

$$g_{T_0}(t) = \sum_{n=-\infty}^{\infty} \left[G(n\Delta f)\,\Delta f\right] e^{(j2\pi n\Delta f)t} \tag{3.7a}$$

Equation (3.7a) shows that $g_{T_0}(t)$ can be expressed as a sum of everlasting exponentials of
frequencies $0, \pm\Delta f, \pm 2\Delta f, \pm 3\Delta f, \ldots$ (the Fourier series). The amount of the component
of frequency $n\Delta f$ is $[G(n\Delta f)\,\Delta f]$. In the limit as $T_0 \to \infty$, $\Delta f \to 0$ and $g_{T_0}(t) \to g(t)$.
Therefore,

$$g(t) = \lim_{T_0 \to \infty} g_{T_0}(t) = \lim_{\Delta f \to 0} \sum_{n=-\infty}^{\infty} G(n\Delta f)\,e^{(j2\pi n\Delta f)t}\,\Delta f \tag{3.7b}$$

The sum on the right-hand side of Eq. (3.7b) can be viewed as the area under the function
$G(f)e^{j2\pi ft}$, as shown in Fig. 3.3. Therefore,

$$g(t) = \int_{-\infty}^{\infty} G(f)\,e^{j2\pi ft}\,df \tag{3.8}$$

† You may consider this as an irrefutable proof of the proposition that 0% ownership of everything is better than
100% ownership of nothing!

The integral on the right-hand side is called the Fourier integral. We have now succeeded
in representing an aperiodic signal $g(t)$ by a Fourier integral* (rather than a Fourier series).
This integral is basically a Fourier series (in the limit) with fundamental frequency $\Delta f \to 0$, as
seen from Eq. (3.7b). The amount of the exponential $e^{j2\pi n\Delta f t}$ is $G(n\Delta f)\,\Delta f$. Thus, the function
$G(f)$ given by Eq. (3.3) acts as a spectral function.

We call $G(f)$ the direct Fourier transform of $g(t)$, and $g(t)$ the inverse Fourier transform
of $G(f)$. The same information is conveyed by the statement that $g(t)$ and $G(f)$ are a Fourier
transform pair. Symbolically, this is expressed as

$$G(f) = \mathcal{F}[g(t)] \qquad \text{and} \qquad g(t) = \mathcal{F}^{-1}[G(f)]$$

or

$$g(t) \Longleftrightarrow G(f)$$

To recapitulate,

$$G(f) = \int_{-\infty}^{\infty} g(t)\,e^{-j2\pi ft}\,dt \tag{3.9a}$$

and

$$g(t) = \int_{-\infty}^{\infty} G(f)\,e^{j2\pi ft}\,df \tag{3.9b}$$

where $\omega = 2\pi f$.

It is helpful to keep in mind that the Fourier integral in Eq. (3.9b) is of the nature of a
Fourier series with fundamental frequency $\Delta f$ approaching zero [Eq. (3.7b)]. Therefore, most
of the discussion and properties of Fourier series apply to the Fourier transform as well. We
can plot the spectrum $G(f)$ as a function of $f$. Since $G(f)$ is complex, we have both amplitude
and angle (or phase) spectra:

$$G(f) = |G(f)|\,e^{j\theta_g(f)}$$

in which $|G(f)|$ is the amplitude and $\theta_g(f)$ is the angle (or phase) of $G(f)$. From Eq. (3.9a),

$$G(-f) = \int_{-\infty}^{\infty} g(t)\,e^{j2\pi ft}\,dt$$

* This should not be considered as a rigorous proof of Eq. (3.8). The situation is not as simple as we have made it
appear.

f versus ω

Traditionally, we often use two equivalent notations of angular frequency $\omega$ and frequency
$f$ interchangeably in representing signals in the frequency domain. There is no conceptual
difference between the use of angular frequency $\omega$ (in units of radians per second) and frequency
$f$ (in units of hertz, Hz). Because of their direct relationship, we can simply substitute $\omega = 2\pi f$
into $G(f)$ to arrive at the Fourier transform relationship in the $\omega$-domain:

$$\mathcal{F}[g(t)] = \int_{-\infty}^{\infty} g(t)\,e^{-j\omega t}\,dt \tag{3.10}$$

Because of the additional $2\pi$ factor in the variable $\omega$ used by Eq. (3.10), the inverse transform
as a function of $\omega$ requires an extra division by $2\pi$. Therefore, the notation of $f$ is slightly
favored in practice when one is writing Fourier transforms. For this reason, we shall, for the
most part, denote the Fourier transforms of signals as functions $G(f)$ in this book. On the
other hand, the notation of angular frequency $\omega$ can also offer some convenience in dealing
with sinusoids. Thus, in later chapters, whenever it is convenient and nonconfusing, we shall
use the two equivalent notations interchangeably.


Conjugate Symmetry Property 

From Eq. (3.9a), it follows that if $g(t)$ is a real function of $t$, then $G(f)$ and $G(-f)$ are complex
conjugates, that is,*

$$G(-f) = G^*(f) \tag{3.11}$$

Therefore,

$$|G(-f)| = |G(f)| \tag{3.12a}$$

$$\theta_g(-f) = -\theta_g(f) \tag{3.12b}$$

Thus, for real $g(t)$, the amplitude spectrum $|G(f)|$ is an even function, and the phase spectrum
$\theta_g(f)$ is an odd function of $f$. This property (the conjugate symmetry property) is valid only
for real $g(t)$. These results were derived for the Fourier spectrum of a periodic signal in Chapter
2 and should come as no surprise. The transform $G(f)$ is the frequency domain specification
of $g(t)$.
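The conjugate symmetry property is easy to confirm numerically for any real signal. The Python sketch below uses a hypothetical real test signal (a unit ramp on 0 < t < 1, zero elsewhere) and a midpoint-rule approximation of Eq. (3.9a); the helper name `fourier` and the chosen frequency are illustrative assumptions.

```python
import cmath
import math


def fourier(g, f, a, b, steps=2000):
    """Midpoint-rule estimate of the Fourier integral of g(t), where g
    is taken to be zero outside the interval (a, b)."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += g(t) * cmath.exp(-2j * math.pi * f * t)
    return total * h


def ramp(t):
    """Hypothetical real signal: g(t) = t on (0, 1)."""
    return t


f = 0.3
G_pos = fourier(ramp, f, 0.0, 1.0)
G_neg = fourier(ramp, -f, 0.0, 1.0)

print(G_pos, G_neg)                               # complex conjugates
print(abs(G_pos), abs(G_neg))                     # equal amplitudes
print(cmath.phase(G_pos), cmath.phase(G_neg))     # opposite phases
```

If `ramp` were replaced by a complex-valued signal, the two printed transforms would in general no longer be conjugates, consistent with the remark that the property holds only for real $g(t)$.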


* Hermitian symmetry is the term used to describe complex functions that satisfy Eq. (3.11).

Example 3.1 Find the Fourier transform of $e^{-at}u(t)$.

By definition [Eq. (3.9a)],

$$G(f) = \int_{-\infty}^{\infty} e^{-at}u(t)\,e^{-j2\pi ft}\,dt = \int_{0}^{\infty} e^{-(a+j2\pi f)t}\,dt = \left.\frac{-1}{a+j2\pi f}\,e^{-(a+j2\pi f)t}\right|_{0}^{\infty}$$

But $|e^{-j2\pi ft}| = 1$. Therefore, as $t \to \infty$, $e^{-(a+j2\pi f)t} = e^{-at}e^{-j2\pi ft} = 0$ if $a > 0$.
Therefore,

$$G(f) = \frac{1}{a+j\omega} \qquad a > 0 \tag{3.13a}$$

where $\omega = 2\pi f$. Expressing $a + j\omega$ in the polar form as $\sqrt{a^2+\omega^2}\,e^{j\tan^{-1}(\omega/a)}$, we obtain

$$G(f) = \frac{1}{\sqrt{a^2+(2\pi f)^2}}\,e^{-j\tan^{-1}(2\pi f/a)} \tag{3.13b}$$

Therefore,

$$|G(f)| = \frac{1}{\sqrt{a^2+(2\pi f)^2}} \qquad \text{and} \qquad \theta_g(f) = -\tan^{-1}\left(\frac{2\pi f}{a}\right)$$

Observe that $|G(f)|$ is an even function of $f$, and $\theta_g(f)$ is an odd function of $f$, as expected.

Figure 3.4 $e^{-at}u(t)$ and its Fourier spectra.
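The closed form of Example 3.1 can be compared against a direct numerical evaluation of the Fourier integral. The following Python sketch assumes an illustrative decay rate a = 2 and truncates the integral at a large time where the exponential is negligible; both choices, and the helper names, are assumptions made for the demonstration.

```python
import cmath
import math

a = 2.0  # decay rate; any a > 0 would do (illustrative value)


def G_numeric(f, t_max=20.0, steps=20000):
    """Midpoint-rule estimate of the Fourier integral of exp(-a t) u(t),
    truncated at t_max where exp(-a t) is negligible."""
    h = t_max / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(-a * t) * cmath.exp(-2j * math.pi * f * t)
    return total * h


def G_exact(f):
    """Closed form of Eq. (3.13a): 1 / (a + j 2 pi f)."""
    return 1.0 / (a + 2j * math.pi * f)


f = 0.5
amp = 1.0 / math.sqrt(a ** 2 + (2 * math.pi * f) ** 2)   # amplitude, Eq. (3.13b)
phase = -math.atan(2 * math.pi * f / a)                  # phase, Eq. (3.13b)

print(G_numeric(f), G_exact(f))
print(abs(G_exact(f)), amp)
print(cmath.phase(G_exact(f)), phase)
```

Trying a negative value of `a` makes the truncated sum grow without bound as `t_max` increases, which is the numerical face of the convergence condition a > 0.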


Existence of the Fourier Transform 

In Example 3.1 we observed that when $a < 0$, the Fourier integral for $e^{-at}u(t)$ does not
converge. Hence, the Fourier transform for $e^{-at}u(t)$ does not exist if $a < 0$ (growing exponentially).
Clearly, not all signals are Fourier transformable. The existence of the Fourier transform
is assured for any $g(t)$ satisfying the Dirichlet conditions, the first of which is*

$$\int_{-\infty}^{\infty} |g(t)|\,dt < \infty \tag{3.14}$$

To show this, recall that $|e^{-j2\pi ft}| = 1$. Hence, from Eq. (3.9a) we obtain

$$|G(f)| \le \int_{-\infty}^{\infty} |g(t)|\,dt$$

This shows that the existence of the Fourier transform is assured if condition (3.14) is satisfied.
Otherwise, there is no guarantee. We have seen in Example 3.1 that for an exponentially growing
signal (which violates this condition) the Fourier transform does not exist. Although this
condition is sufficient, it is not necessary for the existence of the Fourier transform of a signal.
For example, the signal $(\sin at)/t$ violates condition (3.14), but does have a Fourier transform.
Any signal that can be generated in practice satisfies the Dirichlet conditions and therefore has
a Fourier transform. Thus, the physical existence of a signal is a sufficient condition for the
existence of its transform.

* The remaining Dirichlet conditions are as follows: In any finite interval, $g(t)$ may have only a finite number of
maxima and minima and a finite number of finite discontinuities. When these conditions are satisfied, the Fourier
integral on the right-hand side of Eq. (3.9b) converges to $g(t)$ at all points where $g(t)$ is continuous and converges to
the average of the right-hand and left-hand limits of $g(t)$ at points where $g(t)$ is discontinuous.

Linearity of the Fourier Transform (Superposition Theorem) 

The Fourier transform is linear; that is, if

$$g_1(t) \Longleftrightarrow G_1(f) \qquad \text{and} \qquad g_2(t) \Longleftrightarrow G_2(f)$$

then for all constants $a_1$ and $a_2$, we have

$$a_1g_1(t) + a_2g_2(t) \Longleftrightarrow a_1G_1(f) + a_2G_2(f) \tag{3.15}$$

The proof is simple and follows directly from Eq. (3.9a). This theorem simply states that
linear combinations of signals in the time domain correspond to linear combinations of their
Fourier transforms in the frequency domain. This result can be extended to any finite number
of terms as

$$\sum_k a_k g_k(t) \Longleftrightarrow \sum_k a_k G_k(f)$$

for any constants $\{a_k\}$ and signals $\{g_k(t)\}$.
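Linearity can be illustrated with a small numerical experiment. The Python sketch below uses two hypothetical signals on the interval (0, 1), a constant pulse and a ramp, and arbitrary constants; the helper `fourier` is a midpoint-rule approximation of Eq. (3.9a), and none of these names come from the text.

```python
import cmath
import math


def fourier(g, f, a, b, steps=2000):
    """Midpoint-rule estimate of the Fourier integral of g(t) over (a, b)."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += g(t) * cmath.exp(-2j * math.pi * f * t)
    return total * h


def g1(t):
    """Hypothetical signal 1: unit pulse on (0, 1)."""
    return 1.0


def g2(t):
    """Hypothetical signal 2: unit ramp on (0, 1)."""
    return t


a1, a2 = 3.0, -2.0   # arbitrary constants


def combo(t):
    """Linear combination a1*g1(t) + a2*g2(t)."""
    return a1 * g1(t) + a2 * g2(t)


f = 0.4
lhs = fourier(combo, f, 0.0, 1.0)
rhs = a1 * fourier(g1, f, 0.0, 1.0) + a2 * fourier(g2, f, 0.0, 1.0)
print(lhs, rhs)
```

The two printed values agree to rounding error: transforming the combination gives the same result as combining the transforms.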

Physical Appreciation of the Fourier Transform 

To understand any aspect of the Fourier transform, we should remember that Fourier representation
is a way of expressing a signal in terms of everlasting sinusoids, or exponentials.
The Fourier spectrum of a signal indicates the relative amplitudes and phases of the sinusoids
that are required to synthesize that signal. A periodic signal's Fourier spectrum has finite
amplitudes and exists at discrete frequencies ($f_0$ and its multiples). Such a spectrum is easy
to visualize, but the spectrum of an aperiodic signal is not easy to visualize because it has a
continuous spectrum that exists at every frequency. The continuous spectrum concept can be
appreciated by considering an analogous, more tangible phenomenon. One familiar example
of a continuous distribution is the loading of a beam. Consider a beam loaded with weights
$D_1, D_2, D_3, \ldots, D_n$ units at the uniformly spaced points $x_1, x_2, \ldots, x_n$, as shown in Fig. 3.5a.
The total load $W_T$ on the beam is given by the sum of these loads at each of the $n$ points:

$$W_T = \sum_{i=1}^{n} D_i$$
Consider now the case of a continuously loaded beam, as shown in Fig. 3.5b. In this case,
although there appears to be a load at every point, the load at any one point is zero. This does
not mean that there is no load on the beam. A meaningful measure of load in this situation is
not the load at a point, but rather the loading density per unit length at that point. Let $G(x)$
be the loading density per unit length of beam. This means that the load over a beam length
$\Delta x$ ($\Delta x \to 0$) at some point $x$ is $G(x)\,\Delta x$. To find the total load on the beam, we divide the
beam into segments of interval $\Delta x$ ($\Delta x \to 0$). The load over the $n$th such segment of length
$\Delta x$ is $[G(n\Delta x)]\,\Delta x$. The total load $W_T$ is given by

$$W_T = \lim_{\Delta x \to 0} \sum_{x_1}^{x_n} G(n\Delta x)\,\Delta x = \int_{x_1}^{x_n} G(x)\,dx$$

In the case of discrete loading (Fig. 3.5a), the load exists only at the $n$ discrete points. At other
points there is no load. On the other hand, in the continuously loaded case, the load exists at
every point, but at any specific point $x$ the load is zero. The load over a small interval $\Delta x$,
however, is $[G(n\Delta x)]\,\Delta x$ (Fig. 3.5b). Thus, even though the load at a point $x$ is zero, the
relative load at that point is $G(x)$.

Figure 3.5 Analogy for Fourier transform.

An exactly analogous situation exists in the case of a signal spectrum. When $g(t)$ is
periodic, the spectrum is discrete, and $g(t)$ can be expressed as a sum of discrete exponentials
with finite amplitudes:

$$g(t) = \sum_n D_n e^{j2\pi nf_0 t}$$

For an aperiodic signal, the spectrum becomes continuous; that is, the spectrum exists for
every value of $f$, but the amplitude of each component in the spectrum is zero. The meaningful
measure here is not the amplitude of a component of some frequency but the spectral density
per unit bandwidth. From Eq. (3.7b) it is clear that $g(t)$ is synthesized by adding exponentials
of the form $e^{j2\pi n\Delta f t}$, in which the contribution by any one exponential component is zero. But
the contribution by exponentials in an infinitesimal band $\Delta f$ located at $f = n\Delta f$ is $G(n\Delta f)\,\Delta f$,
and the addition of all these components yields $g(t)$ in the integral form:

$$g(t) = \lim_{\Delta f \to 0} \sum_{n=-\infty}^{\infty} G(n\Delta f)\,e^{(j2\pi n\Delta f)t}\,\Delta f = \int_{-\infty}^{\infty} G(f)\,e^{j2\pi ft}\,df$$

The contribution by components within the band $df$ is $G(f)\,df$, in which $df$ is the bandwidth
in hertz. Clearly $G(f)$ is the spectral density per unit bandwidth (in hertz). This also means
that even if the amplitude of any one component is zero, the relative amount of a component
of frequency $f$ is $G(f)$. Although $G(f)$ is a spectral density, in practice it is customarily called
the spectrum of $g(t)$ rather than the spectral density of $g(t)$. Deferring to this convention, we
shall call $G(f)$ the Fourier spectrum (or Fourier transform) of $g(t)$.
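The synthesis sum of Eq. (3.7b) can be tried out numerically. The Python sketch below assumes the standard Gaussian transform pair $e^{-\pi t^2} \Longleftrightarrow e^{-\pi f^2}$, which is not derived in this section and is used here only because both sides are known in closed form; the helper name `synthesize` and the truncation frequency are illustrative.

```python
import cmath
import math


def G(f):
    """Assumed spectrum: the Gaussian pair exp(-pi t^2) <-> exp(-pi f^2)."""
    return math.exp(-math.pi * f * f)


def synthesize(t, delta_f, f_max=5.0):
    """Finite version of the sum in Eq. (3.7b): add up the contributions
    G(n delta_f) * delta_f of exponentials at frequencies n * delta_f."""
    n_max = round(f_max / delta_f)
    total = 0j
    for n in range(-n_max, n_max + 1):
        total += G(n * delta_f) * cmath.exp(2j * math.pi * n * delta_f * t) * delta_f
    return total


t = 0.5
exact = math.exp(-math.pi * t * t)
for delta_f in (0.2, 0.1, 0.05):
    approx = synthesize(t, delta_f)
    # The weight of the single component at f = 0 shrinks with delta_f,
    # yet the sum over all components still reproduces g(t).
    print(delta_f, G(0.0) * delta_f, approx.real, exact)
```

Each halving of $\Delta f$ halves the weight of any single component while doubling the number of components, so the total stays at $g(t)$: the density $G(f)$, not the individual amplitude, is what carries the information.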


3.2 TRANSFORMS OF SOME USEFUL FUNCTIONS 


For convenience, we now introduce a compact notation for some useful functions such as
rectangular, triangular, and interpolation functions.

Figure 3.6 Rectangular pulse.

Unit Rectangular Function

We use the pictorial notation $\Pi(x)$ for a rectangular pulse of unit height and unit width, centered
at the origin, as shown in Fig. 3.6a:

$$\Pi(x) = \begin{cases} 1 & |x| < \frac{1}{2} \\ 0.5 & |x| = \frac{1}{2} \\ 0 & |x| > \frac{1}{2} \end{cases} \tag{3.16}$$

Notice that the rectangular pulse in Fig. 3.6b is the unit rectangular pulse $\Pi(x)$ expanded
by a factor $\tau$ and therefore can be expressed as $\Pi(x/\tau)$. Observe that the denominator $\tau$ in
$\Pi(x/\tau)$ indicates the width of the pulse.


Unit Triangular Function 

We use the pictorial notation $\Delta(x)$ for a triangular pulse of unit height and unit width, centered
at the origin, as shown in Fig. 3.7a:

$$\Delta(x) = \begin{cases} 1 - 2|x| & |x| < \frac{1}{2} \\ 0 & |x| > \frac{1}{2} \end{cases} \tag{3.17}$$

Observe that the pulse in Fig. 3.7b is $\Delta(x/\tau)$. Observe that here, as for the rectangular pulse,
the denominator $\tau$ in $\Delta(x/\tau)$ indicates the pulse width.


Sinc Function sinc(x)

The function $\sin x / x$ is the "sine over argument" function denoted by $\mathrm{sinc}(x)$.*
This function plays an important role in signal processing. We define

$$\mathrm{sinc}(x) = \frac{\sin x}{x} \tag{3.18}$$

* $\mathrm{sinc}(x)$ is also denoted by $\mathrm{Sa}(x)$ in the literature. Some authors define $\mathrm{sinc}(x)$ as
$\mathrm{sinc}(x) = \dfrac{\sin \pi x}{\pi x}$.

Inspection of Eq. (3.18) shows that 

1. $\mathrm{sinc}(x)$ is an even function of $x$.

2. $\mathrm{sinc}(x) = 0$ when $\sin x = 0$ except at $x = 0$, where it is indeterminate. This means that
$\mathrm{sinc}(x) = 0$ for $x = \pm\pi, \pm 2\pi, \pm 3\pi, \ldots$

3. Using L'Hôpital's rule, we find $\mathrm{sinc}(0) = 1$.

4. $\mathrm{sinc}(x)$ is the product of an oscillating signal $\sin x$ (of period $2\pi$) and a monotonically
decreasing function $1/x$. Therefore, $\mathrm{sinc}(x)$ exhibits sinusoidal oscillations of period $2\pi$,
with amplitude decreasing continuously as $1/x$.

5. In summary, $\mathrm{sinc}(x)$ is an even oscillating function with decreasing amplitude. It has a unit
peak at $x = 0$ and zero crossings at nonzero integer multiples of $\pi$.
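
The three compact functions can be written down directly from Eqs. (3.16) to (3.18). The following is a minimal sketch in Python, not part of the original text; the function names `rect`, `tri`, and `sinc` and the sample width `tau` are illustrative choices, and `sinc` uses the unnormalized convention $\sin x / x$ adopted here.

```python
import math

def rect(x):
    """Unit rectangular pulse of Eq. (3.16): unit height, unit width."""
    ax = abs(x)
    if ax < 0.5:
        return 1.0
    if ax == 0.5:
        return 0.5
    return 0.0

def tri(x):
    """Unit triangular pulse of Eq. (3.17): unit height, unit width."""
    ax = abs(x)
    return 1.0 - 2.0 * ax if ax < 0.5 else 0.0

def sinc(x):
    """sin(x)/x of Eq. (3.18), with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

tau = 4.0  # assumed pulse width, for illustration only
# Samples of the width-tau pulses rect(t/tau) and tri(t/tau) at t = 1
print(rect(1.0 / tau), tri(1.0 / tau))
# Unit peak at the origin and a zero crossing at x = pi
print(sinc(0.0), abs(sinc(math.pi)) < 1e-12)
```

Dividing the argument by `tau` is all that is needed to obtain a pulse of width $\tau$, which mirrors the role of the denominator in $\Pi(x/\tau)$ and $\Delta(x/\tau)$.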

Figure 3.8a shows $\mathrm{sinc}(x)$. Observe that $\mathrm{sinc}(x) = 0$ for values of $x$ that are positive and
negative integral multiples of $\pi$. Figure 3.8b shows $\mathrm{sinc}(3\omega/7)$. The argument $3\omega/7 = \pi$
when $\omega = 7\pi/3$ or $f = 7/6$. Therefore, the first zero of this function occurs at $\omega = 7\pi/3$
($f = 7/6$).

Example 3.2 Find the Fourier transform of $g(t) = \Pi(t/\tau)$ (Fig. 3.9a).

Figure 3.9 Rectangular pulse and its Fourier spectrum.
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We have 

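$$G(f) = \int_{-\infty}^{\infty} \Pi\!\left(\frac{t}{\tau}\right) e^{-j2\pi f t}\,dt$$

Since $\Pi(t/\tau) = 1$ for $|t| < \tau/2$, and since it is zero for $|t| > \tau/2$,

$$G(f) = \int_{-\tau/2}^{\tau/2} e^{-j2\pi f t}\,dt
= -\frac{1}{j2\pi f}\left(e^{-j\pi f\tau} - e^{j\pi f\tau}\right)
= \frac{2\sin(\pi f\tau)}{2\pi f}
= \tau\,\frac{\sin(\pi f\tau)}{(\pi f\tau)} = \tau\,\mathrm{sinc}(\pi f\tau)$$

Therefore,

$$\Pi\!\left(\frac{t}{\tau}\right) \Longleftrightarrow \tau\,\mathrm{sinc}\!\left(\frac{\omega\tau}{2}\right) = \tau\,\mathrm{sinc}(\pi f\tau) \tag{3.19}$$

Recall that $\mathrm{sinc}(x) = 0$ when $x = \pm n\pi$. Hence, $\mathrm{sinc}(\omega\tau/2) = 0$ when $\omega\tau/2 = \pm n\pi$;
that is, when $f = \pm n/\tau$ ($n = 1, 2, 3, \ldots$), as shown in Fig. 3.9b. Observe that in this case
$G(f)$ happens to be real. Hence, we may convey the spectral information by a single plot
of $G(f)$ shown in Fig. 3.9b.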
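
As a rough numerical cross-check of Eq. (3.19), the Fourier integral of the gate pulse can be approximated by a midpoint Riemann sum and compared with $\tau\,\mathrm{sinc}(\pi f\tau)$. This is only a sketch; the helper name `rect_transform_numeric`, the step count, and the test frequencies are assumptions made for illustration.

```python
import cmath
import math

def sinc(x):
    """sin(x)/x with sinc(0) = 1, as in Eq. (3.18)."""
    return 1.0 if x == 0 else math.sin(x) / x

def rect_transform_numeric(f, tau, steps=20000):
    """Midpoint-rule approximation of the integral of exp(-j*2*pi*f*t) over |t| < tau/2."""
    dt = tau / steps
    total = 0j
    for k in range(steps):
        t = -tau / 2 + (k + 0.5) * dt
        total += cmath.exp(-2j * math.pi * f * t) * dt
    return total

tau = 2.0  # assumed pulse width in seconds
for f in (0.0, 0.2, 0.5, 1.3):
    numeric = rect_transform_numeric(f, tau)
    closed = tau * sinc(math.pi * f * tau)   # Eq. (3.19)
    print(f, round(numeric.real, 6), round(closed, 6))
```

With $\tau = 2$, the frequency $f = 0.5 = 1/\tau$ falls on the first spectral null, so both columns should be close to zero there.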


Example 3.3 Find the Fourier transform of the unit impulse signal $\delta(t)$.

We use the sampling property of the impulse function [Eq. (2.11)] to obtain

$$\mathcal{F}[\delta(t)] = \int_{-\infty}^{\infty} \delta(t)\,e^{-j2\pi f t}\,dt = e^{-j2\pi f\cdot 0} = 1 \tag{3.20a}$$

or

$$\delta(t) \Longleftrightarrow 1 \tag{3.20b}$$

Figure 3.10 shows $\delta(t)$ and its spectrum.

Figure 3.10 Unit impulse and its Fourier spectrum: (a) $g(t) = \delta(t)$; (b) $G(f) = 1$.



Example 3.4 Find the inverse Fourier transform of $\delta(2\pi f) = \frac{1}{2\pi}\delta(f)$.

From Eq. (3.9b) and the sampling property of the impulse function,

$$\mathcal{F}^{-1}[\delta(2\pi f)] = \int_{-\infty}^{\infty} \delta(2\pi f)\,e^{j2\pi f t}\,df
= \frac{1}{2\pi}\int_{-\infty}^{\infty} \delta(2\pi f)\,e^{j2\pi f t}\,d(2\pi f)
= \frac{1}{2\pi}\cdot e^{j0\cdot t} = \frac{1}{2\pi}$$

Therefore,

$$\frac{1}{2\pi} \Longleftrightarrow \delta(2\pi f) \tag{3.21a}$$

or

$$1 \Longleftrightarrow \delta(f) \tag{3.21b}$$

This shows that the spectrum of a constant signal $g(t) = 1$ is an impulse $\delta(f) = 2\pi\delta(2\pi f)$,
as shown in Fig. 3.11.

Figure 3.11 Constant (dc) signal and its Fourier spectrum: (a) $g(t) = 1$; (b) $G(f) = \delta(f)$.

The result [Eq. (3.21b)] also could have been anticipated on qualitative grounds. Recall
that the Fourier transform of $g(t)$ is a spectral representation of $g(t)$ in terms of everlasting
exponential components of the form $e^{j2\pi f t}$. Now to represent a constant signal $g(t) = 1$,
we need a single everlasting exponential $e^{j2\pi f t}$ with $f = 0$. This results in a spectrum at a
single frequency $f = 0$. We could also say that $g(t) = 1$ is a dc signal that has a single
frequency component at $f = 0$ (dc).

If an impulse at $f = 0$ is a spectrum of a dc signal, what does an impulse at $f = f_0$
represent? We shall answer this question in the next example.


Example 3.5 Find the inverse Fourier transform of $\delta(f - f_0)$.

We use the sampling property of the impulse function to obtain

$$\mathcal{F}^{-1}[\delta(f - f_0)] = \int_{-\infty}^{\infty} \delta(f - f_0)\,e^{j2\pi f t}\,df = e^{j2\pi f_0 t}$$

Therefore,

$$e^{j2\pi f_0 t} \Longleftrightarrow \delta(f - f_0) \tag{3.22a}$$

This result shows that the spectrum of an everlasting exponential $e^{j2\pi f_0 t}$ is a single impulse
at $f = f_0$. We reach the same conclusion by qualitative reasoning. To represent the everlasting
exponential $e^{j2\pi f_0 t}$, we need a single everlasting exponential $e^{j2\pi f t}$ with $\omega = 2\pi f_0$.
Therefore, the spectrum consists of a single component at frequency $f = f_0$.
From Eq. (3.22a) it follows that

$$e^{-j2\pi f_0 t} \Longleftrightarrow \delta(f + f_0) \tag{3.22b}$$


Example 3.6 Find the Fourier transform of the everlasting sinusoid $\cos 2\pi f_0 t$.

Recall the Euler formula

$$\cos 2\pi f_0 t = \frac{1}{2}\left(e^{j2\pi f_0 t} + e^{-j2\pi f_0 t}\right)$$

Adding Eqs. (3.22a) and (3.22b), and using the preceding formula, we obtain

$$\cos 2\pi f_0 t \Longleftrightarrow \frac{1}{2}\left[\delta(f + f_0) + \delta(f - f_0)\right] \tag{3.23}$$

The spectrum of $\cos 2\pi f_0 t$ consists of two impulses at $f_0$ and $-f_0$ in the $f$-domain, or,
two impulses at $\pm\omega_0 = \pm 2\pi f_0$ in the $\omega$-domain as shown in Fig. 3.12. The result also
follows from qualitative reasoning. An everlasting sinusoid $\cos \omega_0 t$ can be synthesized by
two everlasting exponentials, $e^{j\omega_0 t}$ and $e^{-j\omega_0 t}$. Therefore, the Fourier spectrum consists
of only two components of frequencies $\omega_0$ and $-\omega_0$.
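
A discrete analogue of Eq. (3.23) can be observed with a small discrete Fourier transform (DFT): a sampled cosine with an integer number of cycles per record has exactly two nonzero bins. This is a sketch under stated assumptions, not the continuous-time result itself; the record length `N`, the cycle count `k0`, and the direct-sum `dft` helper are all illustrative.

```python
import cmath
import math

def dft(x):
    """Direct-sum DFT, normalized by the record length."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]

N, k0 = 32, 5   # assumed: 32 samples containing exactly 5 cycles
x = [math.cos(2 * math.pi * k0 * m / N) for m in range(N)]
X = dft(x)

# Bins with non-negligible magnitude; bin N - k0 plays the role of -f0
peaks = [k for k in range(N) if abs(X[k]) > 1e-9]
print(peaks)
# Each of the two components carries half the amplitude, matching the factor 1/2
print(abs(X[k0]), abs(X[N - k0]))
```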


Figure 3.12 Cosine signal and its Fourier spectrum.

Example 3.7 Find the Fourier transform of the sign function $\mathrm{sgn}(t)$ (pronounced signum $t$), shown in
Fig. 3.13. Its value is $+1$ or $-1$, depending on whether $t$ is positive or negative:

$$\mathrm{sgn}(t) = \begin{cases} 1 & t > 0 \\ 0 & t = 0 \\ -1 & t < 0 \end{cases} \tag{3.24}$$

We cannot use integration to find the transform of $\mathrm{sgn}(t)$ directly. This is because $\mathrm{sgn}(t)$
violates the Dirichlet condition [see Eq. (3.14) and the associated footnote]. Specifically,
$\mathrm{sgn}(t)$ is not absolutely integrable. However, the transform can be obtained by considering
$\mathrm{sgn}\,t$ as a sum of two exponentials, as shown in Fig. 3.13, in the limit as $a \to 0$:

$$\mathrm{sgn}\,t = \lim_{a \to 0}\left[e^{-at}u(t) - e^{at}u(-t)\right]$$

Therefore,

$$\mathcal{F}[\mathrm{sgn}(t)] = \lim_{a \to 0}\left\{\mathcal{F}[e^{-at}u(t)] - \mathcal{F}[e^{at}u(-t)]\right\}$$

$$= \lim_{a \to 0}\left(\frac{1}{a + j2\pi f} - \frac{1}{a - j2\pi f}\right) \quad \text{(see pairs 1 and 2 in Table 3.1)}$$

$$= \lim_{a \to 0}\left(\frac{-j4\pi f}{a^2 + 4\pi^2 f^2}\right) = \frac{1}{j\pi f} \tag{3.25}$$
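
The limiting argument of Eq. (3.25) can be watched numerically by evaluating the difference of pairs 1 and 2 for progressively smaller $a$. The sketch below is illustrative only; the function names and the sample frequency are assumptions.

```python
import math

def sgn_approx_transform(f, a):
    """Transform of exp(-a*t)u(t) - exp(a*t)u(-t), from pairs 1 and 2 of Table 3.1."""
    return 1 / complex(a, 2 * math.pi * f) - 1 / complex(a, -2 * math.pi * f)

def sgn_transform(f):
    """Limiting value 1/(j*pi*f) of Eq. (3.25), valid for f != 0."""
    return 1 / (1j * math.pi * f)

f = 0.8  # assumed test frequency in Hz
for a in (1.0, 0.1, 0.001):
    print(a, sgn_approx_transform(f, a), sgn_transform(f))
```

The printed values should approach the purely imaginary limit as $a$ shrinks, since the error term is of the order of $a^2/(2\pi f)^2$.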


3.3 SOME PROPERTIES OF THE 
FOURIER TRANSFORM 

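We now study some of the important properties of the Fourier transform and their implications
as well as their applications. Before embarking on this study, it is important to point out a
pervasive aspect of the Fourier transform: the time-frequency duality.

TABLE 3.1
Short Table of Fourier Transforms

No.   g(t)                                      G(f)                                                                                  Condition
1     $e^{-at}u(t)$                             $\frac{1}{a + j2\pi f}$                                                               $a > 0$
2     $e^{at}u(-t)$                             $\frac{1}{a - j2\pi f}$                                                               $a > 0$
3     $e^{-a|t|}$                               $\frac{2a}{a^2 + (2\pi f)^2}$                                                         $a > 0$
4     $t e^{-at}u(t)$                           $\frac{1}{(a + j2\pi f)^2}$                                                           $a > 0$
5     $t^n e^{-at}u(t)$                         $\frac{n!}{(a + j2\pi f)^{n+1}}$                                                      $a > 0$
6     $\delta(t)$                               $1$
7     $1$                                       $\delta(f)$
8     $e^{j2\pi f_0 t}$                         $\delta(f - f_0)$
9     $\cos 2\pi f_0 t$                         $0.5\,[\delta(f + f_0) + \delta(f - f_0)]$
10    $\sin 2\pi f_0 t$                         $j0.5\,[\delta(f + f_0) - \delta(f - f_0)]$
11    $u(t)$                                    $\frac{1}{2}\delta(f) + \frac{1}{j2\pi f}$
12    $\mathrm{sgn}\,t$                         $\frac{2}{j2\pi f}$
13    $\cos 2\pi f_0 t\,u(t)$                   $\frac{1}{4}[\delta(f - f_0) + \delta(f + f_0)] + \frac{j2\pi f}{(2\pi f_0)^2 - (2\pi f)^2}$
14    $\sin 2\pi f_0 t\,u(t)$                   $\frac{1}{4j}[\delta(f - f_0) - \delta(f + f_0)] + \frac{2\pi f_0}{(2\pi f_0)^2 - (2\pi f)^2}$
15    $e^{-at}\sin 2\pi f_0 t\,u(t)$            $\frac{2\pi f_0}{(a + j2\pi f)^2 + 4\pi^2 f_0^2}$                                     $a > 0$
16    $e^{-at}\cos 2\pi f_0 t\,u(t)$            $\frac{a + j2\pi f}{(a + j2\pi f)^2 + 4\pi^2 f_0^2}$                                  $a > 0$
17    $\Pi\!\left(\frac{t}{\tau}\right)$        $\tau\,\mathrm{sinc}(\pi f\tau)$
18    $2B\,\mathrm{sinc}(2\pi Bt)$              $\Pi\!\left(\frac{f}{2B}\right)$
19    $\Delta\!\left(\frac{t}{\tau}\right)$     $\frac{\tau}{2}\,\mathrm{sinc}^2\!\left(\frac{\pi f\tau}{2}\right)$
20    $B\,\mathrm{sinc}^2(\pi Bt)$              $\Delta\!\left(\frac{f}{2B}\right)$
21    $\sum_{n=-\infty}^{\infty}\delta(t - nT)$ $f_0\sum_{n=-\infty}^{\infty}\delta(f - nf_0)$                                        $f_0 = \frac{1}{T}$
22    $e^{-t^2/2\sigma^2}$                      $\sigma\sqrt{2\pi}\,e^{-2(\sigma\pi f)^2}$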


3.3.1 Time-Frequency Duality 

Equations (3.9) show an interesting fact: the direct and the inverse transform operations are
remarkably similar. These operations, required to go from $g(t)$ to $G(f)$ and then from $G(f)$
to $g(t)$, are shown graphically in Fig. 3.14. The only minor difference between these two
operations lies in the opposite signs used in their exponential indices.

Figure 3.14 Near symmetry between direct and inverse Fourier transforms.

This similarity has far-reaching consequences in the study of Fourier transforms. It is the
basis of the so-called duality of time and frequency. The duality principle may be compared
with a photograph and its negative. A photograph can be obtained from its negative, and
by using an identical procedure, the negative can be obtained from the photograph. For any
result or relationship between $g(t)$ and $G(f)$, there exists a dual result or relationship, obtained
by interchanging the roles of $g(t)$ and $G(f)$ in the original result (along with some minor
modifications arising because of the factor $2\pi$ and a sign change). For example, the
time-shifting property, to be proved later, states that if $g(t) \Longleftrightarrow G(f)$, then

$$g(t - t_0) \Longleftrightarrow G(f)\,e^{-j2\pi f t_0}$$

The dual of this property (the frequency-shifting property) states that

$$g(t)\,e^{j2\pi f_0 t} \Longleftrightarrow G(f - f_0)$$

Observe the role reversal of time and frequency in these two equations (with the minor
difference of the sign change in the exponential index). The value of this principle lies in the fact
that whenever we derive any result, we can be sure that it has a dual. This knowledge can give
valuable insights about many unsuspected properties or results in signal processing.

The properties of the Fourier transform are useful not only in deriving the direct and
the inverse transforms of many functions, but also in obtaining several valuable results in
signal processing. The reader should not fail to observe the ever-present duality in this
discussion. We begin with the duality property, which is one of the consequences of the duality
principle.

3.3.2 Duality Property 

The duality property states that if

$$g(t) \Longleftrightarrow G(f)$$

then

$$G(t) \Longleftrightarrow g(-f) \tag{3.26}$$

The duality property states that if the Fourier transform of $g(t)$ is $G(f)$, then the Fourier
transform of $G(t)$, with $f$ replaced by $t$, is $g(-f)$, which is the original time domain signal
with $t$ replaced by $-f$.

Proof. From Eq. (3.9b),

$$g(t) = \int_{-\infty}^{\infty} G(x)\,e^{j2\pi x t}\,dx$$

Hence,

$$g(-t) = \int_{-\infty}^{\infty} G(x)\,e^{-j2\pi x t}\,dx$$

Changing $t$ to $f$ yields Eq. (3.26). ■

Substituting $\tau = 2\pi\alpha$, we obtain

$$\tau\,\mathrm{sinc}\!\left(\frac{\tau t}{2}\right) \Longleftrightarrow 2\pi\,\Pi\!\left(\frac{2\pi f}{\tau}\right) \tag{3.28b}$$

In Eq. (3.28) we used the fact that $\Pi(-t) = \Pi(t)$ because $\Pi(t)$ is an even function. Figure
3.15b shows this pair graphically. Observe the interchange of the roles of $t$ and $2\pi f$ (with
the minor adjustment of the factor $2\pi$). This result appears as pair 18 in Table 3.1 (with
$\tau/2 = W$).

As an interesting exercise, generate a dual of every pair in Table 3.1 by applying the duality
property.


3.3.3 Time-Scaling Property 

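If

$$g(t) \Longleftrightarrow G(f)$$

then, for any real constant $a$,

$$g(at) \Longleftrightarrow \frac{1}{|a|}\,G\!\left(\frac{f}{a}\right) \tag{3.29}$$

Proof: For a positive real constant $a$,

$$\mathcal{F}[g(at)] = \int_{-\infty}^{\infty} g(at)\,e^{-j2\pi f t}\,dt
= \frac{1}{a}\int_{-\infty}^{\infty} g(x)\,e^{(-j2\pi f/a)x}\,dx
= \frac{1}{a}\,G\!\left(\frac{f}{a}\right)$$

Similarly, it can be shown that if $a < 0$,

$$g(at) \Longleftrightarrow \frac{-1}{a}\,G\!\left(\frac{f}{a}\right)$$

Hence follows Eq. (3.29). ■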
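
Equation (3.29) can be checked in closed form with pair 3 of Table 3.1. Taking $g(t) = e^{-|t|}$, the scaled signal $g(at) = e^{-|a||t|}$ is again covered by pair 3 with parameter $|a|$, so both sides of Eq. (3.29) are known explicitly. The sketch below is illustrative; the function names and the chosen values of $a$ and $f$ are assumptions.

```python
import math

def G(f, a=1.0):
    """Transform of exp(-a|t|), pair 3 of Table 3.1 (a > 0)."""
    return 2 * a / (a * a + (2 * math.pi * f) ** 2)

def scaled(f, a):
    """Right-hand side of Eq. (3.29) applied to g(t) = exp(-|t|)."""
    return G(f / a) / abs(a)

for a in (0.5, 2.0, -3.0):
    for f in (0.0, 0.4, 1.5):
        # exp(-|a*t|) = exp(-|a||t|), whose transform is pair 3 with parameter |a|
        print(a, f, G(f, abs(a)), scaled(f, a))
```

The negative value of $a$ exercises the $1/|a|$ factor, which is what makes a single formula cover both time compression and time reversal.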

Significance of the Time-Scaling Property 

The function $g(at)$ represents the function $g(t)$ compressed in time by a factor $a$ ($|a| > 1$).
Similarly, a function $G(f/a)$ represents the function $G(f)$ expanded in frequency by the same
factor. The time-scaling property states that time compression of a signal results in its spectral
expansion, and time expansion of the signal results in its spectral compression. Intuitively,
compression in time by a factor $a$ means that the signal is varying more rapidly by the
same factor. To synthesize such a signal, the frequencies of its sinusoidal components must
be increased by the factor $a$, implying that its frequency spectrum is expanded by the factor
$a$. Similarly, a signal expanded in time varies more slowly; hence, the frequencies of its
components are lowered, implying that its frequency spectrum is compressed. For instance,
the signal $\cos 4\pi f_0 t$ is the same as the signal $\cos 2\pi f_0 t$ time-compressed by a factor of 2.
Clearly, the spectrum of the former (impulse at $\pm 2f_0$) is an expanded version of the spectrum
of the latter (impulse at $\pm f_0$). The effect of this scaling is demonstrated in Fig. 3.16.

Figure 3.16 Scaling property of the Fourier transform.




Reciprocity of Signal Duration and Its Bandwidth 

The time-scaling property implies that if $g(t)$ is wider, its spectrum is narrower, and vice
versa. Doubling the signal duration halves its bandwidth, and vice versa. This suggests that the
bandwidth of a signal is inversely proportional to the signal duration or width (in seconds). We
have already verified this fact for the rectangular pulse, where we found that the bandwidth
of a gate pulse of width $\tau$ seconds is $1/\tau$ Hz. More discussion of this interesting topic can be
found in the literature.²


Example 3.9 Show that

$$g(-t) \Longleftrightarrow G(-f) \tag{3.30}$$

Use this result and the fact that $e^{-at}u(t) \Longleftrightarrow 1/(a + j2\pi f)$, to find the Fourier transforms of
$e^{at}u(-t)$ and $e^{-a|t|}$.

Equation (3.30) follows from Eq. (3.29) by letting $a = -1$. Application of Eq. (3.30) to
pair 1 of Table 3.1 yields

$$e^{at}u(-t) \Longleftrightarrow \frac{1}{a - j2\pi f}$$

Also

$$e^{-a|t|} = e^{-at}u(t) + e^{at}u(-t)$$

Therefore,

$$e^{-a|t|} \Longleftrightarrow \frac{1}{a + j2\pi f} + \frac{1}{a - j2\pi f} = \frac{2a}{a^2 + (2\pi f)^2} \tag{3.31}$$

Figure 3.17 $e^{-a|t|}$ and its Fourier spectrum.



3.3.4 Time-Shifting Property 

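If

$$g(t) \Longleftrightarrow G(f)$$

then

$$g(t - t_0) \Longleftrightarrow G(f)\,e^{-j2\pi f t_0} \tag{3.32a}$$

Proof: By definition,

$$\mathcal{F}[g(t - t_0)] = \int_{-\infty}^{\infty} g(t - t_0)\,e^{-j2\pi f t}\,dt$$

Letting $t - t_0 = x$, we have

$$\mathcal{F}[g(t - t_0)] = \int_{-\infty}^{\infty} g(x)\,e^{-j2\pi f(x + t_0)}\,dx
= e^{-j2\pi f t_0}\int_{-\infty}^{\infty} g(x)\,e^{-j2\pi f x}\,dx = G(f)\,e^{-j2\pi f t_0} \tag{3.32b}$$

This result shows that delaying a signal by $t_0$ seconds does not change its amplitude spectrum.
The phase spectrum, however, is changed by $-2\pi f t_0$.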

Physical Explanation of the Linear Phase 

Time delay in a signal causes a linear phase shift in its spectrum. This result can also be derived
by heuristic reasoning. Imagine $g(t)$ being synthesized by its Fourier components, which are
sinusoids of certain amplitudes and phases. The delayed signal $g(t - t_0)$ can be synthesized by
the same sinusoidal components, each delayed by $t_0$ seconds. The amplitudes of the components
remain unchanged. Therefore, the amplitude spectrum of $g(t - t_0)$ is identical to that of $g(t)$.
The time delay of $t_0$ in each sinusoid, however, does change the phase of each component.
Now, a sinusoid $\cos 2\pi f t$ delayed by $t_0$ is given by

$$\cos 2\pi f(t - t_0) = \cos(2\pi f t - 2\pi f t_0)$$

Therefore, a time delay $t_0$ in a sinusoid of frequency $f$ manifests as a phase delay of $2\pi f t_0$. This is
a linear function of $f$, meaning that higher frequency components must undergo proportionately
higher phase shifts to achieve the same time delay. This effect is shown in Fig. 3.18 with two
sinusoids, the frequency of the lower sinusoid being twice that of the upper. The same time
delay $t_0$ amounts to a phase shift of $\pi/2$ in the upper sinusoid and a phase shift of $\pi$ in the
lower sinusoid. This verifies that to achieve the same time delay, higher frequency sinusoids
must undergo proportionately higher phase shifts.

Figure 3.18 Physical explanation of the time-shifting property.


Example 3.10 Find the Fourier transform of $e^{-a|t - t_0|}$.

This function, shown in Fig. 3.19a, is a time-shifted version of $e^{-a|t|}$ (shown in Fig. 3.17a).
From Eqs. (3.31) and (3.32) we have

$$e^{-a|t - t_0|} \Longleftrightarrow \frac{2a}{a^2 + (2\pi f)^2}\,e^{-j2\pi f t_0} \tag{3.33}$$

The spectrum of $e^{-a|t - t_0|}$ (Fig. 3.19b) is the same as that of $e^{-a|t|}$ (Fig. 3.17b), except for
an added phase shift of $-2\pi f t_0$.

Figure 3.19 Effect of time shifting on the Fourier spectrum of a signal.





Observe that the time delay $t_0$ causes a linear phase spectrum $-2\pi f t_0$. This example
clearly demonstrates the effect of time shift.

3.3.5 Frequency-Shifting Property

If

$$g(t) \Longleftrightarrow G(f)$$

then

$$g(t)\,e^{j2\pi f_0 t} \Longleftrightarrow G(f - f_0) \tag{3.34}$$

This property is also called the modulation property.

Proof. By definition,

$$\mathcal{F}[g(t)\,e^{j2\pi f_0 t}] = \int_{-\infty}^{\infty} g(t)\,e^{j2\pi f_0 t}\,e^{-j2\pi f t}\,dt
= \int_{-\infty}^{\infty} g(t)\,e^{-j2\pi(f - f_0)t}\,dt = G(f - f_0)$$

This property states that multiplication of a signal by a factor $e^{j2\pi f_0 t}$ shifts the spectrum
of that signal by $f = f_0$. Note the duality between the time-shifting and the frequency-shifting
properties.

Changing $f_0$ to $-f_0$ in Eq. (3.34) yields

$$g(t)\,e^{-j2\pi f_0 t} \Longleftrightarrow G(f + f_0) \tag{3.35}$$

Because $e^{j2\pi f_0 t}$ is not a real function that can be generated, frequency shifting in practice
is achieved by multiplying $g(t)$ by a sinusoid. This can be seen from

$$g(t)\cos 2\pi f_0 t = \frac{1}{2}\left[g(t)\,e^{j2\pi f_0 t} + g(t)\,e^{-j2\pi f_0 t}\right]$$

From Eqs. (3.34) and (3.35), it follows that

$$g(t)\cos 2\pi f_0 t \Longleftrightarrow \frac{1}{2}\left[G(f - f_0) + G(f + f_0)\right] \tag{3.36}$$

This shows that the multiplication of a signal $g(t)$ by a sinusoid of frequency $f_0$ shifts the
spectrum $G(f)$ by $\pm f_0$. Multiplication of a sinusoid $\cos 2\pi f_0 t$ by $g(t)$ amounts to modulating
the sinusoid amplitude. This type of modulation is known as amplitude modulation. The
sinusoid $\cos 2\pi f_0 t$ is called the carrier, the signal $g(t)$ is the modulating signal, and the signal
$g(t)\cos 2\pi f_0 t$ is the modulated signal. Modulation and demodulation will be discussed in
detail in Chapters 4 and 5.
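
The spectral shift of Eq. (3.36) can be confirmed numerically for a gate pulse, whose baseband spectrum is known from Eq. (3.19). This is a rough sketch under stated assumptions: the helper names, the midpoint-rule step count, the pulse width `tau`, and the carrier frequency `f0` are illustrative values rather than anything prescribed by the text.

```python
import cmath
import math

def sinc(x):
    """sin(x)/x with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x

def G(f, tau):
    """Spectrum of the gate pulse of width tau, Eq. (3.19)."""
    return tau * sinc(math.pi * f * tau)

def modulated_spectrum(f, tau, f0, steps=40000):
    """Midpoint-rule Fourier integral of rect(t/tau) * cos(2*pi*f0*t)."""
    dt = tau / steps
    total = 0j
    for k in range(steps):
        t = -tau / 2 + (k + 0.5) * dt
        total += math.cos(2 * math.pi * f0 * t) * cmath.exp(-2j * math.pi * f * t) * dt
    return total

tau, f0 = 1.0, 5.0   # assumed pulse width (s) and carrier frequency (Hz)
for f in (-5.0, 4.5, 5.0, 5.5):
    predicted = 0.5 * (G(f - f0, tau) + G(f + f0, tau))   # Eq. (3.36)
    print(f, round(modulated_spectrum(f, tau, f0).real, 6), round(predicted, 6))
```

The largest values appear at $f = \pm f_0$, where one of the two shifted copies of $G(f)$ is evaluated at its peak.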

To sketch a signal $g(t)\cos 2\pi f_0 t$, we observe that

$$g(t)\cos 2\pi f_0 t = \begin{cases} g(t) & \text{when } \cos 2\pi f_0 t = 1 \\ -g(t) & \text{when } \cos 2\pi f_0 t = -1 \end{cases}$$

Therefore, $g(t)\cos 2\pi f_0 t$ touches $g(t)$ when the sinusoid $\cos 2\pi f_0 t$ is at its positive peaks and
touches $-g(t)$ when $\cos 2\pi f_0 t$ is at its negative peaks. This means that $g(t)$ and $-g(t)$ act as
envelopes for the signal $g(t)\cos 2\pi f_0 t$ (see Fig. 3.20c). The signal $-g(t)$ is a mirror image
of $g(t)$ about the horizontal axis. Figure 3.20 shows the signals $g(t)$, $g(t)\cos 2\pi f_0 t$, and their
respective spectra.

Figure 3.20 Amplitude modulation of a signal causes spectral shifting.



Shifting the Phase Spectrum of a Modulated Signal 

We can shift the phase of each spectral component of a modulated signal by a constant amount
$\theta_0$ merely by using a carrier $\cos(2\pi f_0 t + \theta_0)$ instead of $\cos 2\pi f_0 t$. If a signal $g(t)$ is multiplied
by $\cos(2\pi f_0 t + \theta_0)$, then we can use an argument similar to that used to derive Eq. (3.36), to
show that

$$g(t)\cos(2\pi f_0 t + \theta_0) \Longleftrightarrow \frac{1}{2}\left[G(f - f_0)\,e^{j\theta_0} + G(f + f_0)\,e^{-j\theta_0}\right] \tag{3.37}$$

For a special case when $\theta_0 = -\pi/2$, Eq. (3.37) becomes

$$g(t)\sin 2\pi f_0 t \Longleftrightarrow \frac{1}{2}\left[G(f - f_0)\,e^{-j\pi/2} + G(f + f_0)\,e^{j\pi/2}\right] \tag{3.38}$$

Observe that $\sin 2\pi f_0 t$ is $\cos 2\pi f_0 t$ with a phase delay of $\pi/2$. Thus, shifting the carrier phase
by $\pi/2$ shifts the phase of every spectral component by $\pi/2$. Figures 3.20e and f show the
signal $g(t)\sin 2\pi f_0 t$ and its spectrum.

Modulation is a common application that shifts signal spectra. In particular, if several
message signals, each occupying the same frequency band, are transmitted simultaneously
over a common transmission medium, they will all interfere; it will be impossible to separate
or retrieve them at a receiver. For example, if all radio stations decide to broadcast audio signals
simultaneously, receivers will not be able to separate them. This problem is solved by using
modulation, whereby each radio station is assigned a distinct carrier frequency. Each station
transmits a modulated signal, thus shifting the signal spectrum to its allocated band, which is
not occupied by any other station. A radio receiver can pick up any station by tuning to the
band of the desired station. The receiver must now demodulate the received signal (undo the
effect of modulation). Demodulation therefore consists of another spectral shift required to
restore the signal to its original band.

Figure 3.21 Bandpass signal and its spectrum.

Bandpass Signals 

Figure 3.20(d)(f) shows that if $g_c(t)$ and $g_s(t)$ are low-pass signals, each with a bandwidth $B$
Hz or $2\pi B$ rad/s, then the signals $g_c(t)\cos 2\pi f_0 t$ and $g_s(t)\sin 2\pi f_0 t$ are both bandpass signals
occupying the same band, and each having a bandwidth of $2B$ Hz. Hence, a linear combination
of both these signals will also be a bandpass signal occupying the same band as that of
either signal, and with the same bandwidth ($2B$ Hz). Hence, a general bandpass signal $g_{bp}(t)$
can be expressed as*

$$g_{bp}(t) = g_c(t)\cos 2\pi f_0 t + g_s(t)\sin 2\pi f_0 t \tag{3.39}$$

The spectrum of $g_{bp}(t)$ is centered at $\pm f_0$ and has a bandwidth $2B$, as shown in Fig. 3.21.
Although the magnitude spectra of both $g_c(t)\cos 2\pi f_0 t$ and $g_s(t)\sin 2\pi f_0 t$ are symmetrical
about $\pm f_0$, the magnitude spectrum of their sum, $g_{bp}(t)$, is not necessarily symmetrical about
$\pm f_0$. This is because the different phases of the two signals do not allow their amplitudes to
add directly for the reason that

$$a_1 e^{j\varphi_1} + a_2 e^{j\varphi_2} \neq (a_1 + a_2)\,e^{j(\varphi_1 + \varphi_2)}$$

A typical bandpass signal $g_{bp}(t)$ and its spectra are shown in Fig. 3.21. We can use a well-known
trigonometric identity to express Eq. (3.39) as

$$g_{bp}(t) = E(t)\cos[2\pi f_0 t + \psi(t)] \tag{3.40}$$

where

$$E(t) = +\sqrt{g_c^2(t) + g_s^2(t)} \tag{3.41a}$$

$$\psi(t) = -\tan^{-1}\left[\frac{g_s(t)}{g_c(t)}\right] \tag{3.41b}$$

* See Sec. 9.9 for a rigorous proof of this statement.

Because $g_c(t)$ and $g_s(t)$ are low-pass signals, $E(t)$ and $\psi(t)$ are also low-pass signals. Because
$E(t)$ is nonnegative [Eq. (3.41a)], it follows from Eq. (3.40) that $E(t)$ is a slowly varying
envelope and $\psi(t)$ is a slowly varying phase of the bandpass signal $g_{bp}(t)$, as shown in Fig. 3.21.
Thus, the bandpass signal $g_{bp}(t)$ will appear as a sinusoid of slowly varying amplitude. Because
of the time-varying phase $\psi(t)$, the frequency of the sinusoid also varies slowly* with time
about the center frequency $f_0$.
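
The equivalence of the quadrature form (3.39) and the envelope-phase form (3.40) can be spot-checked at a few time instants. The sketch below is illustrative only: the low-pass components `gc` and `gs`, the carrier frequency `f0`, and the sample times are hypothetical, and `atan2` is used in place of the plain arctangent of Eq. (3.41b) so that the phase lands in the correct quadrant when $g_c(t)$ is negative.

```python
import math

def envelope_and_phase(gc_val, gs_val):
    """Envelope E and phase psi following Eqs. (3.41a) and (3.41b)."""
    E = math.hypot(gc_val, gs_val)
    psi = -math.atan2(gs_val, gc_val)   # quadrant-aware version of -arctan(gs/gc)
    return E, psi

f0 = 100.0   # assumed carrier frequency in Hz

def gc(t):
    """Hypothetical slowly varying in-phase component."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * 2.0 * t)

def gs(t):
    """Hypothetical slowly varying quadrature component."""
    return 0.3 * math.sin(2 * math.pi * 3.0 * t)

def quadrature_form(t):
    return gc(t) * math.cos(2 * math.pi * f0 * t) + gs(t) * math.sin(2 * math.pi * f0 * t)

def polar_form(t):
    E, psi = envelope_and_phase(gc(t), gs(t))
    return E * math.cos(2 * math.pi * f0 * t + psi)

for t in (0.0, 0.0137, 0.21, 0.4):
    print(t, quadrature_form(t), polar_form(t))
```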


* It is necessary that $B \ll f_0$ for a well-defined envelope. Otherwise the variations of $E(t)$ are of the same order as
the carrier, and it will be difficult to separate the envelope from the carrier.

Example 3.11 Find the Fourier transform of a general periodic signal $g(t)$ of period $T_0$, and hence, determine
the Fourier transform of the periodic impulse train $\delta_{T_0}(t)$ shown in Fig. 3.22a.

Figure 3.22 (a) Impulse train $\delta_{T_0}(t)$; (b) its spectrum $G(f) = \frac{1}{T_0}\delta_{f_0}(f)$, with impulses at multiples of $f_0$.

A periodic signal $g(t)$ can be expressed as an exponential Fourier series as

$$g(t) = \sum_{n=-\infty}^{\infty} D_n\,e^{jn2\pi f_0 t} \qquad f_0 = \frac{1}{T_0}$$

Therefore,

$$g(t) \Longleftrightarrow \sum_{n=-\infty}^{\infty} \mathcal{F}\left[D_n\,e^{jn2\pi f_0 t}\right]$$

Now from Eq. (3.22a), it follows that

$$g(t) \Longleftrightarrow \sum_{n=-\infty}^{\infty} D_n\,\delta(f - nf_0) \tag{3.42}$$

Equation (2.67) shows that the impulse train $\delta_{T_0}(t)$ can be expressed as an exponential
Fourier series as

$$\delta_{T_0}(t) = \frac{1}{T_0}\sum_{n=-\infty}^{\infty} e^{jn2\pi f_0 t} \qquad f_0 = \frac{1}{T_0}$$

Here $D_n = 1/T_0$. Therefore, from Eq. (3.42),

$$\delta_{T_0}(t) \Longleftrightarrow \frac{1}{T_0}\sum_{n=-\infty}^{\infty} \delta(f - nf_0)
= \frac{1}{T_0}\,\delta_{f_0}(f) \qquad f_0 = \frac{1}{T_0} \tag{3.43}$$

Thus, the spectrum of the impulse train also happens to be an impulse train (in the frequency
domain), as shown in Fig. 3.22b.


3.3.6 Convolution Theorem 

The convolution of two functions $g(t)$ and $w(t)$, denoted by $g(t) * w(t)$, is defined by the
integral

$$g(t) * w(t) = \int_{-\infty}^{\infty} g(\tau)\,w(t - \tau)\,d\tau$$

The time convolution property and its dual, the frequency convolution property, state
that if

$$g_1(t) \Longleftrightarrow G_1(f) \quad \text{and} \quad g_2(t) \Longleftrightarrow G_2(f)$$

then (time convolution)

$$g_1(t) * g_2(t) \Longleftrightarrow G_1(f)\,G_2(f) \tag{3.44}$$

and (frequency convolution)

$$g_1(t)\,g_2(t) \Longleftrightarrow G_1(f) * G_2(f) \tag{3.45}$$

These two relationships of the convolution theorem state that convolution of two signals in
the time domain becomes multiplication in the frequency domain, while multiplication of two
signals in the time domain becomes convolution in the frequency domain.

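Proof. By definition,

$$\mathcal{F}[g_1(t) * g_2(t)] = \int_{-\infty}^{\infty} e^{-j2\pi f t}\left[\int_{-\infty}^{\infty} g_1(\tau)\,g_2(t - \tau)\,d\tau\right]dt
= \int_{-\infty}^{\infty} g_1(\tau)\left[\int_{-\infty}^{\infty} e^{-j2\pi f t}\,g_2(t - \tau)\,dt\right]d\tau$$

The inner integral is the Fourier transform of $g_2(t - \tau)$, given by [time-shifting property in
Eq. (3.32a)] $G_2(f)\,e^{-j2\pi f\tau}$. Hence,

$$\mathcal{F}[g_1(t) * g_2(t)] = \int_{-\infty}^{\infty} g_1(\tau)\,e^{-j2\pi f\tau}\,G_2(f)\,d\tau
= G_2(f)\int_{-\infty}^{\infty} g_1(\tau)\,e^{-j2\pi f\tau}\,d\tau = G_1(f)\,G_2(f)$$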
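
A discrete counterpart of Eq. (3.44) is easy to verify: for finite sequences, the DFT of a circular convolution equals the product of the individual DFTs. The sketch below uses this discrete analogue as a stand-in for the continuous-time statement; the direct-sum `dft`, the `circular_convolve` helper, and the two sample sequences are illustrative assumptions.

```python
import cmath
import math

def dft(x):
    """Direct-sum DFT (unnormalized)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def circular_convolve(x, y):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * y[(k - m) % n] for m in range(n)) for k in range(n)]

# Hypothetical short sequences, zero-padded to a common length of 8
g1 = [1.0, 2.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
g2 = [0.5, -1.0, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]

lhs = dft(circular_convolve(g1, g2))               # transform of the convolution
rhs = [a * b for a, b in zip(dft(g1), dft(g2))]    # product of the transforms
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)   # should be at rounding-error level
```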

The frequency convolution property (3.45) can be proved in exactly the same way by reversing
the roles of $g(t)$ and $G(f)$.

Bandwidth of the Product of Two Signals

If $g_1(t)$ and $g_2(t)$ have bandwidths $B_1$ and $B_2$ Hz, respectively, the bandwidth of $g_1(t)g_2(t)$ is
$B_1 + B_2$ Hz. This result follows from the application of the width property of convolution³
to Eq. (3.45). This property states that the width of $x * y$ is the sum of the widths of $x$ and $y$.
Consequently, if the bandwidth of $g(t)$ is $B$ Hz, then the bandwidth of $g^2(t)$ is $2B$ Hz, and the
bandwidth of $g^n(t)$ is $nB$ Hz.*

Example 3.12 Using the time convolution property, show that if

$$g(t) \Longleftrightarrow G(f)$$

then

$$\int_{-\infty}^{t} g(\tau)\,d\tau \Longleftrightarrow \frac{G(f)}{j2\pi f} + \frac{1}{2}G(0)\,\delta(f) \tag{3.46}$$

Because

$$u(t - \tau) = \begin{cases} 1 & \tau \le t \\ 0 & \tau > t \end{cases}$$

it follows that

$$g(t) * u(t) = \int_{-\infty}^{\infty} g(\tau)\,u(t - \tau)\,d\tau = \int_{-\infty}^{t} g(\tau)\,d\tau$$

Now from the time convolution property [Eq. (3.44)], it follows that

$$g(t) * u(t) \Longleftrightarrow G(f)\,U(f)
= G(f)\left[\frac{1}{j2\pi f} + \frac{1}{2}\delta(f)\right]
= \frac{G(f)}{j2\pi f} + \frac{1}{2}G(0)\,\delta(f)$$

In deriving the last result we used pair 11 of Table 3.1 and Eq. (2.10a).


* The width property of convolution does not hold in some pathological cases. It fails when the convolution of two
functions is zero over a range even when both functions are nonzero [e.g., $\sin 2\pi f_0 t\,u(t) * u(t)$]. Technically the
property holds even in this case if in calculating the width of the convolved function, we take into account the range
in which the convolution is zero.

3.3.7 Time Differentiation and Time Integration

If

$$g(t) \Longleftrightarrow G(f),$$

then (time differentiation)*

$$\frac{dg(t)}{dt} \Longleftrightarrow j2\pi f\,G(f) \tag{3.47}$$

and (time integration)

$$\int_{-\infty}^{t} g(\tau)\,d\tau \Longleftrightarrow \frac{G(f)}{j2\pi f} + \frac{1}{2}G(0)\,\delta(f) \tag{3.48}$$

Proof. Differentiation of both sides of Eq. (3.9b) yields

$$\frac{dg(t)}{dt} = \int_{-\infty}^{\infty} j2\pi f\,G(f)\,e^{j2\pi f t}\,df$$

This shows that

$$\frac{dg(t)}{dt} \Longleftrightarrow j2\pi f\,G(f)$$

Repeated application of this property yields

$$\frac{d^n g(t)}{dt^n} \Longleftrightarrow (j2\pi f)^n\,G(f) \tag{3.49}$$

The time integration property [Eq. (3.48)] already has been proved in Example 3.12.


Example 3.13 Use the time differentiation property to find the Fourier transform of the triangular pulse $\Delta(t/\tau)$
shown in Fig. 3.23a.

Figure 3.23 Using the time differentiation property to find the Fourier transform of a piecewise-linear signal.

* Valid only if the transform of $dg(t)/dt$ exists.

To find the Fourier transform of this pulse, we differentiate it successively, as shown in
Fig. 3.23b and c. The second derivative consists of a sequence of impulses (Fig. 3.23c).
Recall that the derivative of a signal at a jump discontinuity is an impulse of strength equal
to the amount of jump. The function $dg(t)/dt$ has a positive jump of $2/\tau$ at $t = \pm\tau/2$,
and a negative jump of $4/\tau$ at $t = 0$. Therefore,

$$\frac{d^2 g(t)}{dt^2} = \frac{2}{\tau}\left[\delta\!\left(t + \frac{\tau}{2}\right) - 2\delta(t) + \delta\!\left(t - \frac{\tau}{2}\right)\right] \tag{3.50}$$

From the time differentiation property [Eq. (3.49)],

$$\frac{d^2 g(t)}{dt^2} \Longleftrightarrow (j2\pi f)^2\,G(f) = -(2\pi f)^2\,G(f) \tag{3.51a}$$

Also, from the time-shifting property [Eqs. (3.32)],

$$\delta(t - t_0) \Longleftrightarrow e^{-j2\pi f t_0} \tag{3.51b}$$

Taking the Fourier transform of Eq. (3.50) and using the results in Eq. (3.51), we obtain

$$(j2\pi f)^2\,G(f) = \frac{2}{\tau}\left(e^{j\pi f\tau} - 2 + e^{-j\pi f\tau}\right)
= \frac{4}{\tau}\left(\cos \pi f\tau - 1\right)
= -\frac{8}{\tau}\sin^2\!\left(\frac{\pi f\tau}{2}\right)$$

and

$$G(f) = \frac{8}{(2\pi f)^2\,\tau}\sin^2\!\left(\frac{\pi f\tau}{2}\right)
= \frac{\tau}{2}\left[\frac{\sin(\pi f\tau/2)}{\pi f\tau/2}\right]^2
= \frac{\tau}{2}\,\mathrm{sinc}^2\!\left(\frac{\pi f\tau}{2}\right) \tag{3.52}$$

The spectrum $G(f)$ is shown in Fig. 3.23d. This procedure of finding the Fourier transform
can be applied to any function $g(t)$ made up of straight-line segments with $g(t) \to 0$ as
$|t| \to \infty$. The second derivative of such a signal yields a sequence of impulses whose
Fourier transform can be found by inspection. This example suggests a numerical method
of finding the Fourier transform of an arbitrary signal $g(t)$ by approximating the signal by
straight-line segments.
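
Equation (3.52) can be cross-checked by evaluating the Fourier integral of $\Delta(t/\tau)$ directly. The following is a minimal numerical sketch, not the differentiation-based method of the example itself; the helper names, the midpoint-rule step count, and the test values of `tau` and `f` are assumptions.

```python
import cmath
import math

def tri(x):
    """Unit triangular pulse of Eq. (3.17)."""
    ax = abs(x)
    return 1.0 - 2.0 * ax if ax < 0.5 else 0.0

def sinc(x):
    """sin(x)/x with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x

def tri_transform_numeric(f, tau, steps=20000):
    """Midpoint-rule Fourier integral of tri(t/tau) over |t| < tau/2."""
    dt = tau / steps
    total = 0j
    for k in range(steps):
        t = -tau / 2 + (k + 0.5) * dt
        total += tri(t / tau) * cmath.exp(-2j * math.pi * f * t) * dt
    return total

tau = 2.0   # assumed pulse width in seconds
for f in (0.0, 0.3, 1.0, 1.7):
    closed = (tau / 2) * sinc(math.pi * f * tau / 2) ** 2   # Eq. (3.52)
    print(f, round(tri_transform_numeric(f, tau).real, 6), round(closed, 6))
```

With $\tau = 2$, the value at $f = 0$ is the pulse area $\tau/2 = 1$, and $f = 1 = 2/\tau$ falls on the first null of the $\mathrm{sinc}^2$ spectrum.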


To provide easy reference, several important properties of the Fourier transform are
summarized in Table 3.2.


3.4 SIGNAL TRANSMISSION THROUGH 
A LINEAR SYSTEM 

A linear time-invariant (LTI) continuous time system can be characterized equally well in either
the time domain or the frequency domain. The LTI system model, illustrated in Fig. 3.24,
can often be used to characterize communication channels. In communication systems and
in signal processing, we are interested only in bounded-input-bounded-output (BIBO) stable
linear systems. Detailed discussions on system stability can be found in the textbook by Lathi.³

TABLE 3.2
Properties of Fourier Transform Operations

Operation                 g(t)                                   G(f)
Superposition             $g_1(t) + g_2(t)$                      $G_1(f) + G_2(f)$
Scalar multiplication     $k\,g(t)$                              $k\,G(f)$
Duality                   $G(t)$                                 $g(-f)$
Time scaling              $g(at)$                                $\frac{1}{|a|}\,G\!\left(\frac{f}{a}\right)$
Time shifting             $g(t - t_0)$                           $G(f)\,e^{-j2\pi f t_0}$
Frequency shifting        $g(t)\,e^{j2\pi f_0 t}$                $G(f - f_0)$
Time convolution          $g_1(t) * g_2(t)$                      $G_1(f)\,G_2(f)$
Frequency convolution     $g_1(t)\,g_2(t)$                       $G_1(f) * G_2(f)$
Time differentiation      $\frac{d^n g(t)}{dt^n}$                $(j2\pi f)^n\,G(f)$
Time integration          $\int_{-\infty}^{t} g(x)\,dx$          $\frac{G(f)}{j2\pi f} + \frac{1}{2}G(0)\,\delta(f)$

Figure 3.24 Signal transmission through a linear time-invariant system. Input signal: $x(t)$ (time domain), $X(f)$ (frequency domain). LTI system: $h(t)$, $H(f)$. Output signal: $y(t) = h(t) * x(t)$, $Y(f) = H(f)\,X(f)$.
A stable LTI system can be characterized in the time domain by its impulse response $h(t)$, which
is the system response to a unit impulse input, that is,

$$y(t) = h(t) \quad \text{when} \quad x(t) = \delta(t)$$

The system response to a bounded input signal $x(t)$ follows the convolutional relationship

$$y(t) = h(t) * x(t) \tag{3.53}$$

The frequency domain relationship between the input and the output is obtained by taking
the Fourier transform of both sides of Eq. (3.53). We let

$$x(t) \Longleftrightarrow X(f)$$
$$y(t) \Longleftrightarrow Y(f)$$
$$h(t) \Longleftrightarrow H(f)$$

Then according to the convolution theorem, Eq. (3.53) becomes

$$Y(f) = H(f) \cdot X(f) \tag{3.54}$$

Generally $H(f)$, the Fourier transform of the impulse response $h(t)$, is referred to as the
transfer function or the frequency response of the LTI system. Again, in general, $H(f)$ is
complex and can be written as

$$H(f) = |H(f)|\,e^{j\theta_h(f)}$$

where $|H(f)|$ is the amplitude response and $\theta_h(f)$ is the phase response of the LTI system.

3.4.1 Signal Distortion during Transmission 

The transmission of an input signal $x(t)$ through a system changes it into the output signal $y(t)$.
Equation (3.54) shows the nature of this change or modification. Here $X(f)$ and $Y(f)$ are the
spectra of the input and the output, respectively. Therefore, $H(f)$ is the spectral response of the
system. The output spectrum is given by the input spectrum multiplied by the spectral response
of the system. Equation (3.54) clearly brings out the spectral shaping (or modification) of the
signal by the system. Equation (3.54) can be expressed in polar form as

$$|Y(f)|\,e^{j\theta_y(f)} = |X(f)|\,|H(f)|\,e^{j[\theta_x(f) + \theta_h(f)]}$$

Therefore, we have the amplitude and phase relationships

$$|Y(f)| = |X(f)|\,|H(f)| \tag{3.55a}$$

$$\theta_y(f) = \theta_x(f) + \theta_h(f) \tag{3.55b}$$

During the transmission, the input signal amplitude spectrum $|X(f)|$ is changed to $|X(f)| \cdot
|H(f)|$. Similarly, the input signal phase spectrum $\theta_x(f)$ is changed to $\theta_x(f) + \theta_h(f)$.

An input signal spectral component of frequency $f$ is modified in amplitude by a factor
$|H(f)|$ and is shifted in phase by an angle $\theta_h(f)$. Clearly, $|H(f)|$ is the amplitude response,
and $\theta_h(f)$ is the phase response of the system. The plots of $|H(f)|$ and $\theta_h(f)$ as functions of
$f$ show at a glance how the system modifies the amplitudes and phases of various sinusoidal
inputs. This is why $H(f)$ is called the frequency response of the system. During transmission
through the system, some frequency components may be boosted in amplitude, while others
may be attenuated. The relative phases of the various components also change. In general, the
output waveform will be different from the input waveform.
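
The amplitude and phase relationships (3.55a) and (3.55b) can be illustrated on a single spectral component. The sketch below is hypothetical throughout: the first-order low-pass response `H`, its cutoff `fc`, and the input component are assumptions introduced only for illustration and do not come from the text.

```python
import cmath
import math

def H(f, fc=1000.0):
    """Hypothetical first-order low-pass response; fc is an assumed cutoff in Hz."""
    return 1 / complex(1.0, f / fc)

f = 2000.0                    # frequency of the component under study (Hz)
X = cmath.rect(2.0, 0.3)      # input component: amplitude 2, phase 0.3 rad
Y = H(f) * X                  # Eq. (3.54) evaluated at this frequency

# Eq. (3.55a): output amplitude is the product of the amplitudes
print(abs(Y), abs(X) * abs(H(f)))
# Eq. (3.55b): output phase is the sum of the phases
print(cmath.phase(Y), cmath.phase(X) + cmath.phase(H(f)))
```

For this assumed response the component at twice the cutoff is attenuated and delayed in phase, which is the kind of spectral shaping the paragraph above describes.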

3.4.2 Distortionless Transmission 

In several applications, such as signal amplification or message signal transmission over a 
communication channel, we require the output waveform to be a replica of the input waveform. 
In such cases, we need to minimize the distortion caused by the amplifier or the communication 
channel. It is therefore of practical interest to determine the characteristics of a system that 
allows a signal to pass without distortion (distortionless transmission). 

Transmission is said to be distortionless if the input and the output have identical wave
shapes within a multiplicative constant. A delayed output that retains the input waveform is also
considered distortionless. Thus, in distortionless transmission, the input $x(t)$ and the output
$y(t)$ satisfy the condition

$$y(t) = k \cdot x(t - t_d) \tag{3.56}$$

The Fourier transform of this equation yields

$$Y(f) = kX(f)e^{-j2\pi f t_d}$$

But because

$$Y(f) = X(f)H(f)$$

we therefore have

$$H(f) = k\,e^{-j2\pi f t_d}$$

This is the transfer function required for distortionless transmission. From this equation it
follows that

$$|H(f)| = k \tag{3.57a}$$

$$\theta_h(f) = -2\pi f t_d \tag{3.57b}$$

Figure 3.25 Linear time-invariant system frequency response for distortionless transmission.


This shows that for distortionless transmission, the amplitude response $|H(f)|$ must be a
constant, and the phase response $\theta_h(f)$ must be a linear function of $f$ going through the origin
$f = 0$, as shown in Fig. 3.25. The slope of $\theta_h(f)$ with respect to the angular frequency $\omega = 2\pi f$
is $-t_d$, where $t_d$ is the delay of the output with respect to the input.*
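The conditions in Eq. (3.57) can be checked numerically on a sum of sinusoids. In the sketch below, each component has its amplitude multiplied by $k$ and its phase shifted by $-2\pi f t_d$, and the result is compared with $k\,x(t - t_d)$. The three test components and the values of $k$ and $t_d$ are arbitrary assumptions.

```python
import math

def multitone(t, components):
    """Sum of sinusoids; components is a list of (amplitude, frequency_hz, phase_rad)."""
    return sum(a * math.cos(2 * math.pi * f * t + p) for a, f, p in components)

def through_distortionless(components, k, t_d):
    """Scale every amplitude by k and add the phase -2*pi*f*t_d, per Eq. (3.57)."""
    return [(k * a, f, p - 2 * math.pi * f * t_d) for a, f, p in components]

# Assumed test input: three components with arbitrary amplitudes and phases.
x_components = [(1.0, 50.0, 0.0), (0.5, 120.0, 0.3), (0.25, 310.0, -1.1)]
k, t_d = 2.0, 0.004
y_components = through_distortionless(x_components, k, t_d)

times = [n * 1e-4 for n in range(200)]
max_error = max(abs(multitone(t, y_components) - k * multitone(t - t_d, x_components))
                for t in times)
print(f"max |y(t) - k x(t - t_d)| = {max_error:.2e}")
```

The error is at the level of floating-point rounding, which is what Eq. (3.56) predicts when the gain is constant and the phase is linear in $f$.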

All-Pass vs. Distortionless System

In circuit analysis and filter designs, we sometimes are mainly concerned with the gain of a
system response. An all-pass system has a constant gain for all frequencies [i.e., $|H(f)| = k$],
without the linear phase requirement. Note from Eq. (3.57) that a distortionless system is
always an all-pass system, whereas the converse is not true. Because it is very common for
beginners to be confused by the difference between all-pass and distortionless systems, now
is the best time to clarify.

To see how an all-pass system may lead to distortion, let us consider an illustrative example.
Imagine that we would like to transmit a recorded music signal from a violin-cello duet. The
violin contributes to the high-frequency part of this music signal, while the cello contributes to
the bass part. When this music signal is transmitted through a particular all-pass system, both
parts have the same gain. However, suppose that this all-pass system would cause a 1-second
extra delay on the high-frequency content of the music (from the violin). As a result, the
audience on the receiving end will hear a "music" signal that is totally out of sync even though
all signal components have the same gain and all are present. The difference in transmission
delay for components of different frequencies is caused by the nonlinear phase of $H(f)$
in the all-pass filter.
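The duet example can be imitated with two unit-gain tones. This is only a sketch: the 100 Hz and 1500 Hz tones, the 48 kHz sampling rate, and the 0.3 ms extra delay (instead of the 1 second in the text) are assumptions chosen to keep the computation small.

```python
import math

def duet(t, delay_low=0.0, delay_high=0.0):
    """Two unit-gain tones standing in for the cello (low) and the violin (high)."""
    low = math.cos(2 * math.pi * 100.0 * (t - delay_low))
    high = math.cos(2 * math.pi * 1500.0 * (t - delay_high))
    return low + high

times = [n / 48000.0 for n in range(4800)]            # 0.1 s at an assumed 48 kHz rate
reference = [duet(t - 0.002) for t in times]          # the input delayed by 2 ms
equal_delay = [duet(t, 0.002, 0.002) for t in times]  # same delay for both tones
unequal_delay = [duet(t, 0.002, 0.0023) for t in times]  # violin delayed 0.3 ms more

err_equal = max(abs(a - b) for a, b in zip(equal_delay, reference))
err_unequal = max(abs(a - b) for a, b in zip(unequal_delay, reference))
print(f"equal delays:   max deviation from delayed input = {err_equal:.3f}")
print(f"unequal delays: max deviation from delayed input = {err_unequal:.3f}")
```

Both systems have unit gain at both frequencies, yet only the equal-delay system reproduces a delayed replica of the input.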


* In addition, we require that $\theta_h(0)$ either be 0 (as shown in Fig. 3.25) or have a constant value $n\pi$ ($n$ an integer),
that is, $\theta_h(f) = n\pi - 2\pi f t_d$. The addition of the excess phase of $n\pi$ may at most change the sign of the signal.

To be more precise, the transfer function gain $|H(f)|$ determines the gain of each input
frequency component, whereas $\angle H(f)$ determines the delay of each component. Imagine a
system input $x(t)$ consisting of multiple sinusoids (its spectral components). For the output
signal $y(t)$ to be distortionless, it should be the input signal multiplied by a gain $k$ and delayed
by $t_d$. To synthesize such a signal, $y(t)$ needs exactly the same components as those of $x(t)$,
with each component multiplied by $k$ and delayed by $t_d$. This means that the system transfer
function $H(f)$ should be such that each sinusoidal component encounters the same gain (or
loss) $k$ and each component undergoes the same time delay of $t_d$ seconds. The first condition
requires that

$$|H(f)| = k$$

We have seen earlier (Sec. 3.3) that to achieve the same time delay $t_d$ for every frequency
component requires a linear phase delay $2\pi f t_d$ (Fig. 3.18) through the origin

$$\theta_h(f) = -2\pi f t_d$$


In practice, many systems have a phase characteristic that may be only approximately
linear. A convenient method of checking phase linearity is to plot the slope of $\angle H(f)$ as a
function of frequency. This slope can be a function of $f$ in the general case and is given by

$$t_d(f) = -\frac{1}{2\pi}\,\frac{d\theta_h(f)}{df} \tag{3.58}$$

If the slope of $\theta_h$ is constant (that is, if $\theta_h$ is linear with respect to $f$), all the components
are delayed by the same time interval $t_d$. But if the slope is not constant, then the time delay
$t_d$ varies with frequency. This means that different frequency components undergo different
amounts of time delay, and consequently the output waveform will not be a replica of the
input waveform (as in the example of the violin-cello duet). For a signal transmission to be
distortionless, $t_d(f)$ should be a constant $t_d$ over the frequency band of interest.†

Thus, there is a clear distinction between all-pass and distortionless systems. It is a common
mistake to think that flatness of amplitude response $|H(f)|$ alone can guarantee signal quality.
A system that has a flat amplitude response may yet distort a signal beyond recognition if the
phase response is not linear ($t_d$ not constant).
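To make Eq. (3.58) concrete, the following sketch estimates $t_d(f)$ by numerically differentiating the phase of a first-order all-pass response $H(f) = (1 - j2\pi f\tau)/(1 + j2\pi f\tau)$. This particular response and the value of $\tau$ are assumptions introduced here for illustration; they do not appear in the text.

```python
import cmath
import math

def allpass(f_hz, tau):
    """First-order all-pass H(f) = (1 - j*2*pi*f*tau) / (1 + j*2*pi*f*tau); |H(f)| = 1."""
    w = 2 * math.pi * f_hz * tau
    return (1 - 1j * w) / (1 + 1j * w)

def delay_from_phase(h, f_hz, df=1e-3):
    """Eq. (3.58): t_d(f) = -(1/(2*pi)) * d(theta_h)/df, using a central difference."""
    theta_hi = cmath.phase(h(f_hz + df))
    theta_lo = cmath.phase(h(f_hz - df))
    return -(theta_hi - theta_lo) / (2 * df) / (2 * math.pi)

TAU = 1e-3  # hypothetical time constant, in seconds
for f in (10.0, 100.0, 1000.0):
    gain = abs(allpass(f, TAU))
    t_d = delay_from_phase(lambda x: allpass(x, TAU), f)
    print(f"f = {f:6.0f} Hz  |H| = {gain:.3f}  t_d = {1e3 * t_d:.4f} ms")
```

The gain is unity at every frequency, but the delay falls from nearly $2\tau$ at low frequencies to a small fraction of $\tau$ at high frequencies, so this all-pass system is not distortionless.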


† Figure 3.25 shows that for distortionless transmission, the phase response not only is linear but also must pass
through the origin. This latter requirement can be somewhat relaxed for bandpass signals. The phase at the origin
may be any constant [$\theta_h(f) = \theta_0 - 2\pi f t_d$ or $\theta_h(0) = \theta_0$]. The reason for this can be found in Eq. (3.37), which
shows that the addition of a constant phase $\theta_0$ to a spectrum of a bandpass signal amounts to a phase shift of the
carrier by $\theta_0$. The modulating signal (the envelope) is not affected. The output envelope is the same as the input
envelope delayed by

$$t_g = -\frac{1}{2\pi}\,\frac{d\theta_h(f)}{df}\bigg|_{f=f_0}$$

called the group delay or envelope delay, and the output carrier is the same as the input carrier delayed by

$$t_p = -\frac{\theta_h(f_0)}{2\pi f_0}$$

called the phase delay, where $f_0$ is the center frequency of the passband.

The Nature of Distortion in Audio and Video Signals

Generally speaking, a human ear can readily perceive amplitude distortion, although it is
relatively insensitive to phase distortion. For the phase distortion to become noticeable, the
variation in delay (variation in the slope of $\theta_h$) should be comparable to the signal duration (or
the physically perceptible duration, in case the signal itself is long). In the case of audio signals,
each spoken syllable can be considered to be an individual signal. The average duration of a
spoken syllable is on the order of 0.01 to 0.1 second. The audio systems may
have nonlinear phases, yet no noticeable signal distortion results because in practical audio
systems, maximum variation in the slope of $\theta_h$ is only a small fraction of a millisecond. This
is the real reason behind the statement that "the human ear is relatively insensitive to phase
distortion."⁴ As a result, the manufacturers of audio equipment make available only $|H(f)|$, the
amplitude response characteristic of their systems.

For video signals, on the other hand, the situation is exactly the opposite. The human
eye is sensitive to phase distortion but is relatively insensitive to amplitude distortion. The
amplitude distortion in television signals manifests itself as a partial destruction of the relative
half-tone values of the resulting picture, which is not readily apparent to the human eye. The
phase distortion (nonlinear phase), on the other hand, causes different time delays in different
picture elements. This results in a smeared picture, which is readily apparent to the human eye.
Phase distortion is also very important in digital communication systems because the nonlinear 
phase characteristic of a channel causes pulse dispersion (spreading out), which in turn causes 
pulses to interfere with neighboring pulses. This interference can cause an error in the pulse 
amplitude at the receiver: a binary 1 may read as 0, and vice versa. 


3.5 IDEAL VERSUS PRACTICAL FILTERS 

Ideal filters allow distortionless transmission of a certain band of frequencies and suppress
all the remaining frequencies. The ideal low-pass filter (Fig. 3.26), for example, allows all
components below $f = B$ Hz to pass without distortion and suppresses all components above
$f = B$. Figure 3.27 shows ideal high-pass and bandpass filter characteristics.

The ideal low-pass filter in Fig. 3.26a has a linear phase of slope $-t_d$ (with respect to
$\omega = 2\pi f$), which results in a time delay of $t_d$ seconds for all its input components of
frequencies below $B$ Hz. Therefore, if the input is a signal $g(t)$ band-limited to $B$ Hz, the
output $y(t)$ is $g(t)$ delayed by $t_d$, that is,

$$y(t) = g(t - t_d)$$

The signal $g(t)$ is transmitted by this system without distortion, but with time delay $t_d$.
For this filter $|H(f)| = \Pi(f/2B)$, and $\theta_h(f) = -2\pi f t_d$, so that

$$H(f) = \Pi\!\left(\frac{f}{2B}\right)e^{-j2\pi f t_d} \tag{3.59a}$$


Figure 3.26 Ideal low-pass filter frequency response and its impulse response.

Figure 3.27 Ideal high-pass and bandpass filter frequency responses.

The unit impulse response $h(t)$ of this filter is found from pair 18 in Table 3.1 and the
time-shifting property:

$$h(t) = \mathcal{F}^{-1}\!\left[\Pi\!\left(\frac{f}{2B}\right)e^{-j2\pi f t_d}\right] = 2B\,\operatorname{sinc}[2\pi B(t - t_d)] \tag{3.59b}$$

Recall that $h(t)$ is the system response to impulse input $\delta(t)$, which is applied at $t = 0$. Figure
3.26b shows a curious fact: the response $h(t)$ begins even before the input is applied (at $t = 0$).
Clearly, the filter is noncausal and therefore unrealizable; that is, such a system is physically
impossible, since no sensible system can respond to an input before it is applied to the system.
Similarly, one can show that other ideal filters (such as the ideal high-pass or the ideal bandpass
filters shown in Fig. 3.27) are also physically unrealizable.

For a physically realizable system, $h(t)$ must be causal; that is,

$$h(t) = 0 \quad \text{for } t < 0$$

In the frequency domain, this condition is equivalent to the Paley-Wiener criterion, which
states that the necessary and sufficient condition for $|H(f)|$ to be the amplitude response of a
realizable (or causal) system is*

$$\int_{-\infty}^{\infty} \frac{\big|\ln|H(f)|\big|}{1 + (2\pi f)^2}\,df < \infty \tag{3.60}$$

If $H(f)$ does not satisfy this condition, it is unrealizable. Note that if $|H(f)| = 0$ over any
finite band, $\big|\ln|H(f)|\big| = \infty$ over that band, and the condition (3.60) is violated. If, however,
$H(f) = 0$ at a single frequency (or a set of discrete frequencies), the integral in Eq. (3.60)
may still be finite even though the integrand is infinite. Therefore, for a physically realizable
system, $H(f)$ may be zero at some discrete frequencies, but it cannot be zero over any finite
band. According to this criterion, ideal filter characteristics (Figs. 3.26 and 3.27) are clearly
unrealizable.
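The behavior of the integral in Eq. (3.60) can be explored numerically by evaluating it over $[-f_{\max}, f_{\max}]$ for increasing $f_{\max}$. The sketch below compares two amplitude responses that we assume purely for illustration: a first-order low-pass response $|H(f)| = 1/\sqrt{1 + (2\pi f)^2}$ and a Gaussian response $|H(f)| = e^{-f^2}$. The midpoint rule and the step count are also our choices.

```python
import math

def paley_wiener_partial(log_gain, f_max, steps=20000):
    """Midpoint-rule estimate of the Eq. (3.60) integral over [-f_max, f_max].

    log_gain(f) must return ln|H(f)|.
    """
    df = 2.0 * f_max / steps
    total = 0.0
    for i in range(steps):
        f = -f_max + (i + 0.5) * df
        total += abs(log_gain(f)) / (1.0 + (2.0 * math.pi * f) ** 2) * df
    return total

def rc_log_gain(f):
    # ln|H(f)| for |H(f)| = 1/sqrt(1 + (2*pi*f)^2), an assumed first-order low-pass
    return -0.5 * math.log(1.0 + (2.0 * math.pi * f) ** 2)

def gaussian_log_gain(f):
    # ln|H(f)| for |H(f)| = exp(-f^2), an assumed Gaussian amplitude response
    return -f * f

for f_max in (50.0, 100.0, 200.0):
    rc = paley_wiener_partial(rc_log_gain, f_max)
    gauss = paley_wiener_partial(gaussian_log_gain, f_max)
    print(f"f_max = {f_max:5.0f}  low-pass: {rc:.4f}  Gaussian: {gauss:.4f}")
```

The partial integral for the low-pass response settles toward a finite value, whereas the one for the Gaussian response keeps growing roughly in proportion to $f_{\max}$. This suggests that a Gaussian amplitude response violates the criterion even though it is never exactly zero.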


* $|H(f)|$ is assumed to be square integrable. That is,

$$\int_{-\infty}^{\infty} |H(f)|^2\,df$$

is assumed to be finite.

The impulse response $h(t)$ in Fig. 3.26 is not realizable. One practical approach to filter
design is to cut off the tail of $h(t)$ for $t < 0$. The resulting causal impulse response $\hat{h}(t)$, where

$$\hat{h}(t) = h(t)u(t)$$

is physically realizable because it is causal (Fig. 3.28). If $t_d$ is sufficiently large, $\hat{h}(t)$ will be a
close approximation of $h(t)$, and the resulting filter $\hat{H}(f)$ will be a good approximation of an
ideal filter. This close realization of the ideal filter is achieved because of the increased value
of time delay $t_d$. This means that the price of close physical approximation is higher delay in
the output; this is often true of noncausal systems. Of course, theoretically a delay $t_d = \infty$ is
needed to realize the ideal characteristics. But a glance at Fig. 3.27b shows that a delay $t_d$ of
three or four times $\pi/W$ will make $\hat{h}(t)$ a reasonably close version of $h(t - t_d)$. For instance,
audio filters are required to handle frequencies of up to 20 kHz (the highest frequency the
human ear can hear). In this case a $t_d$ of about $10^{-4}$ s (0.1 ms) would be a reasonable choice.
The truncation operation [cutting the tail of $h(t)$ to make it causal], however, creates some
unsuspected problems of spectral spread and leakage, which can be partly corrected by
using a tapered window function to truncate $h(t)$ gradually (rather than abruptly).
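The trade-off between delay and approximation quality can be estimated by computing how much of the energy of $h(t)$ in Eq. (3.59b) lies in $t < 0$ and is therefore discarded by the truncation. The sketch below is a rough numerical estimate under our own assumptions: $B = 20$ kHz, a finite integration span, and the midpoint rule. The helper names are hypothetical.

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x, the convention used in this chapter."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def ideal_lowpass_h(t, bandwidth_hz, t_d):
    """Eq. (3.59b): h(t) = 2B sinc[2*pi*B*(t - t_d)]."""
    return 2.0 * bandwidth_hz * sinc(2.0 * math.pi * bandwidth_hz * (t - t_d))

def discarded_energy_fraction(bandwidth_hz, t_d, span_cycles=200, steps=40000):
    """Fraction of the energy of h(t) lying in t < 0, which u(t) removes.

    The total energy of h(t) is 2B by Parseval's theorem. The tail integral is
    cut off at span_cycles/(2B) seconds, which slightly underestimates it.
    """
    t_min = -span_cycles / (2.0 * bandwidth_hz)
    dt = -t_min / steps
    tail = sum(ideal_lowpass_h(t_min + (i + 0.5) * dt, bandwidth_hz, t_d) ** 2
               for i in range(steps)) * dt
    return tail / (2.0 * bandwidth_hz)

B = 20000.0                                   # audio bandwidth from the text
for multiple in (1, 4):
    t_d = multiple / (2.0 * B)                # delay as a multiple of 1/(2B)
    frac = discarded_energy_fraction(B, t_d)
    print(f"t_d = {1e6 * t_d:6.1f} us  discarded energy fraction = {frac:.4f}")
```

With $t_d = 1/(2B)$ roughly 5% of the energy is cut away, while with $t_d = 4/(2B) = 0.1$ ms the loss drops to about 1%.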

In practice, we can realize a variety of filter characteristics to approach ideal
characteristics. Practical (realizable) filter characteristics are gradual, without jump discontinuities
in the amplitude response $|H(f)|$. For example, Butterworth and Chebyshev filters are used
extensively in various applications including practical communication circuits.

Analog signals can also be processed by digital means (A/D conversion). This involves 
sampling, quantizing, and coding. The resulting digital signal can be processed by a small, 
special-purpose digital computer designed to convert the input sequence into a desired output 
sequence. The output sequence is converted back into the desired analog signal. A special 
algorithm of the processing digital computer can be used to achieve a given signal operation 
(e.g., low-pass, bandpass, or high-pass filtering). The subject of digital filtering is somewhat 
beyond our scope in this book. Several excellent books are available on the subject. 3 


3.6 SIGNAL DISTORTION OVER A 
COMMUNICATION CHANNEL 

A signal transmitted over a channel is distorted because of various channel imperfections. The 
nature of signal distortion will now be studied. 

3.6.1 Linear Distortion 


We shall first consider linear time-invariant channels. Signal distortion can be caused over 
such a channel by nonideal characteristics of magnitude distortion, phase distortion, or both. 





We can identify the effects these nonidealities will have on a pulse $g(t)$ transmitted through
such a channel. Let the pulse exist over the interval $(a, b)$ and be zero outside this interval. The
components of the Fourier spectrum of the pulse have such a perfect and delicate balance of
magnitudes and phases that they add up precisely to the pulse $g(t)$ over the interval $(a, b)$ and
to zero outside this interval. The transmission of $g(t)$ through an ideal channel that satisfies the
conditions of distortionless transmission also leaves this balance undisturbed, because a
distortionless channel multiplies each component by the same factor and delays each component
by the same amount of time. Now, if the amplitude response of the channel is not ideal [i.e.,
if $|H(f)|$ is not equal to a constant], this delicate balance will be disturbed, and the sum of all
the components cannot be zero outside the interval $(a, b)$. In short, the pulse will spread out
(see Example 3.14). The same thing happens if the channel phase characteristic is not ideal,
that is, if $\theta_h(f) \neq -2\pi f t_d$. Thus, spreading, or dispersion, of the pulse will occur if either the
amplitude response or the phase response, or both, are nonideal.

Linear channel distortion (dispersion in time) is particularly damaging to digital
communication systems. It introduces what is known as intersymbol interference (ISI). In other words,
a digital symbol, when transmitted over a dispersive channel, tends to spread more widely
than its allotted time. Therefore, adjacent symbols will interfere with one another, thereby
increasing the probability of detection error at the receiver.


Example 3.14 A low-pass filter (Fig. 3.29a) transfer function $H(f)$ is given by

$$H(f) = \begin{cases} (1 + k\cos 2\pi fT)\,e^{-j2\pi f t_d} & |f| < B \\ 0 & |f| > B \end{cases} \tag{3.61}$$

A pulse $g(t)$ band-limited to $B$ Hz (Fig. 3.29b) is applied at the input of this filter. Find the
output $y(t)$.


Figure 3.29 Pulse is dispersed when it passes through a system that is not distortionless.

This filter has ideal phase and nonideal magnitude characteristics. Because $g(t) \Longleftrightarrow G(f)$,
$y(t) \Longleftrightarrow Y(f)$ and

$$\begin{aligned}
Y(f) &= G(f)H(f) \\
     &= G(f)\,\Pi\!\left(\frac{f}{2B}\right)(1 + k\cos 2\pi fT)\,e^{-j2\pi f t_d} \\
     &= G(f)e^{-j2\pi f t_d} + k\,[G(f)\cos 2\pi fT]\,e^{-j2\pi f t_d}
\end{aligned} \tag{3.62}$$

Note that in the derivation of Eq. (3.62), because $g(t)$ is band-limited to $B$ Hz, we have
$G(f) \cdot \Pi(f/2B) = G(f)$. Then, by using the time-shifting property and Eq. (3.32a), we
have

$$y(t) = g(t - t_d) + \frac{k}{2}\,[g(t - t_d - T) + g(t - t_d + T)] \tag{3.63}$$

The output is actually $g(t) + (k/2)[g(t - T) + g(t + T)]$ delayed by $t_d$. It consists of $g(t)$
and its echoes shifted by $\pm T$. The dispersion of the pulse caused by its echoes is evident
from Fig. 3.29c. Ideal amplitude but nonideal phase response of $H(f)$ has a similar effect
(see Prob. 3.6-1).
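The echo interpretation of Eq. (3.63) can be checked numerically. In the sketch below, the band-limited pulse is replaced by a sum of cosines with frequencies below $B$ (an assumed stand-in for $g(t)$). The filter of Eq. (3.61) is applied to each component in the frequency domain, and the result is compared with the time-domain echo form. All numerical values are arbitrary assumptions.

```python
import math

def g(t, components):
    """Test signal built from cosines below B Hz (an assumed stand-in for g(t))."""
    return sum(a * math.cos(2 * math.pi * f * t) for a, f in components)

def filter_output(t, components, k, T, t_d):
    """Apply H(f) = (1 + k cos 2*pi*f*T) exp(-j*2*pi*f*t_d) to every component."""
    return sum(a * (1 + k * math.cos(2 * math.pi * f * T))
               * math.cos(2 * math.pi * f * (t - t_d)) for a, f in components)

def echo_form(t, components, k, T, t_d):
    """Eq. (3.63): the delayed signal plus two echoes located T before and after it."""
    return (g(t - t_d, components)
            + 0.5 * k * (g(t - t_d - T, components) + g(t - t_d + T, components)))

components = [(1.0, 100.0), (0.7, 250.0), (0.4, 900.0)]   # all below an assumed B = 1 kHz
k, T, t_d = 0.5, 0.8e-3, 3e-3
times = [n * 5e-5 for n in range(400)]
max_diff = max(abs(filter_output(t, components, k, T, t_d)
                   - echo_form(t, components, k, T, t_d)) for t in times)
print(f"max difference between the two forms = {max_diff:.2e}")
```

The two forms agree to rounding error, consistent with the identity $2\cos A\cos B = \cos(A - B) + \cos(A + B)$ that underlies the step from Eq. (3.62) to Eq. (3.63).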


3.6.2 Distortion Caused by Channel Nonlinearities 

Until now we have considered the channel to be linear. This approximation is valid only 
for small signals. For large signal amplitudes, nonlinearities cannot be ignored. A general 
discussion of nonlinear systems is beyond our scope. Here we shall consider a simple case 
of a memoryless nonlinear channel where the input g and the output y are related by some 
(memoryless) nonlinear equation, 


$$y = f(g)$$

The right-hand side of this equation can be expanded in a Maclaurin series as

$$y(t) = a_0 + a_1 g(t) + a_2 g^2(t) + a_3 g^3(t) + \cdots + a_k g^k(t) + \cdots$$

Recall the result in Sec. 3.3.6 (convolution) that if the bandwidth of $g(t)$ is $B$ Hz, then the
bandwidth of $g^k(t)$ is $kB$ Hz. Hence, the bandwidth of $y(t)$ is at least $kB$ Hz. Consequently,
the output spectrum spreads well beyond the input spectrum, and the output signal contains
new frequency components not contained in the input signal. In broadcast communication, we
need to amplify signals at very high power levels, where high-efficiency (class C) amplifiers are
desirable. Unfortunately, these amplifiers are nonlinear, and they cause distortion when used
to amplify signals. This is one of the serious problems in AM signals. However, FM signals
are not affected by nonlinear distortion, as shown in Chapter 5. If a signal is transmitted over
a nonlinear channel, the nonlinearity not only distorts the signal but also causes interference
with other signals on the channel because of its spectral dispersion (spreading).
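The spectral spreading caused by a squaring term can be seen with a small discrete Fourier transform. The following sketch is an illustration under our own assumptions: a two-tone input occupying DFT bins 3 and 5, and a memoryless channel $y = g + 0.5g^2$. Neither comes from the text.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of the N-point DFT, normalized by N (direct O(N^2) sum)."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))) / n
            for k in range(n)]

def occupied_bins(samples, threshold=1e-9):
    """Indices of the non-negative-frequency bins (0..N/2) with significant content."""
    mags = dft_magnitudes(samples)
    return [k for k in range(len(samples) // 2 + 1) if mags[k] > threshold]

N = 64
x = [math.cos(2 * math.pi * 3 * i / N) + math.cos(2 * math.pi * 5 * i / N)
     for i in range(N)]
y = [v + 0.5 * v * v for v in x]      # assumed memoryless channel y = g + 0.5 g^2

print("input bins: ", occupied_bins(x))
print("output bins:", occupied_bins(y))
```

The input occupies bins 3 and 5 only. The output also contains dc and the sum, difference, and double-frequency terms (bins 0, 2, 6, 8, and 10), so its highest frequency is twice that of the input.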

For digital communication systems, the nonlinear distortion effect is in contrast to the
time dispersion effect due to linear distortion. Linear distortion causes interference among
signals within the same channel, whereas spectral dispersion due to nonlinear distortion causes
interference among signals using different frequency channels.

Example 3.15 The input $x(t)$ and the output $y(t)$ of a certain nonlinear channel are related as

$$y(t) = x(t) + 0.000158\,x^2(t)$$

Find the output signal $y(t)$ and its spectrum $Y(f)$ if the input signal is
$x(t) = 2000\,\operatorname{sinc}(2000\pi t)$. Verify that the bandwidth of the output signal is twice that of
the input signal. This is the result of signal squaring. Can the signal $x(t)$ be recovered (without
distortion) from the output $y(t)$?


Since

$$x(t) = 2000\,\operatorname{sinc}(2000\pi t) \Longleftrightarrow X(f) = \Pi\!\left(\frac{f}{2000}\right)$$

We have

$$y(t) = x(t) + 0.000158\,x^2(t) = 2000\,\operatorname{sinc}(2000\pi t) + 0.316 \cdot 2000\,\operatorname{sinc}^2(2000\pi t)$$

$$Y(f) = \Pi\!\left(\frac{f}{2000}\right) + 0.316\,\Delta\!\left(\frac{f}{4000}\right)$$

Observe that $0.316 \cdot 2000\,\operatorname{sinc}^2(2000\pi t)$ is the unwanted (distortion) term in the received
signal. Figure 3.30a shows the input (desired) signal spectrum $X(f)$; Fig. 3.30b shows
the spectrum of the undesired (distortion) term; and Fig. 3.30c shows the received signal
spectrum $Y(f)$. We make the following observations.

1. The bandwidth of the received signal $y(t)$ is twice that of the input signal $x(t)$ (because
of signal squaring).

2. The received signal contains the input signal $x(t)$ plus an unwanted signal
$632\,\operatorname{sinc}^2(2000\pi t)$. The spectra of these two signals are shown in Fig. 3.30a and b.
Figure 3.30c shows $Y(f)$, the spectrum of the received signal. Note that spectra of
the desired signal and the distortion signal overlap, and it is impossible to recover the
signal $x(t)$ from the received signal $y(t)$ without some distortion.

3. We can reduce the distortion by passing the received signal through a low-pass filter
of bandwidth 1000 Hz. The spectrum of the output of this filter is shown in Fig. 3.30d.
Observe that the output of this filter is the desired input signal $x(t)$ with some residual
distortion.

4. We have an additional problem of interference with other signals if the input signal $x(t)$
is frequency-division-multiplexed along with several other signals on this channel. This
means that several signals occupying nonoverlapping frequency bands are transmitted
simultaneously on the same channel. Spreading the spectrum $X(f)$ outside its original
band of 1000 Hz will interfere with the signal in the band of 1000 to 2000 Hz. Thus,
in addition to the distortion of $x(t)$, we have an interference with the neighboring
band.

5. If $x(t)$ were a digital signal consisting of a pulse train, each pulse would be distorted,
but there would be no interference with the neighboring pulses. Moreover, even
with distorted pulses, data can be received without loss because digital communication
can withstand considerable pulse distortion without loss of information. Thus,
if this channel were used to transmit a time-division multiplexed signal consisting
of two interleaved pulse trains, the data in the two trains would be recovered at the
receiver.

Figure 3.30 Signal distortion caused by nonlinear operation. (a) Desired (input) signal spectrum.
(b) Spectrum of the unwanted signal (distortion) in the received signal. (c) Spectrum of the
received signal. (d) Spectrum of the received signal after low-pass filtering.


3.6.3 Distortion Caused by Multipath Effects 

A multipath transmission occurs when a transmitted signal arrives at the receiver by two or 
more paths of different delays. For example, if a signal is transmitted over a cable that has 
impedance irregularities (mismatching) along the path, the signal will arrive at the receiver 
in the form of a direct wave plus various reflections with various delays. In radio links, the 
signal can be received by direct path between the transmitting and the receiving antennas and 
also by reflections from other objects, such as hills and buildings. In long-distance radio links 
using the ionosphere, similar effects occur because of one-hop and multihop paths. In each
of these cases, the transmission channel can be represented as several channels in parallel,
each with a different relative attenuation and a different time delay. Let us consider the case
of only two paths: one with a unity gain and a delay $t_d$, and the other with a gain $\alpha$ and a
delay $t_d + \Delta t$, as shown in Fig. 3.31a. The transfer functions of the two paths are given by
$e^{-j2\pi f t_d}$ and $\alpha e^{-j2\pi f(t_d + \Delta t)}$, respectively. The overall transfer function of such a channel is
$H(f)$, given by

$$\begin{aligned}
H(f) &= e^{-j2\pi f t_d} + \alpha e^{-j2\pi f(t_d + \Delta t)} \\
     &= e^{-j2\pi f t_d}\left(1 + \alpha e^{-j2\pi f\Delta t}\right) \qquad\qquad (3.64\text{a}) \\
     &= e^{-j2\pi f t_d}\left(1 + \alpha\cos 2\pi f\Delta t - j\alpha\sin 2\pi f\Delta t\right) \\
     &= \sqrt{1 + \alpha^2 + 2\alpha\cos 2\pi f\Delta t}\;
        \exp\!\left[-j\left(2\pi f t_d + \tan^{-1}\frac{\alpha\sin 2\pi f\Delta t}{1 + \alpha\cos 2\pi f\Delta t}\right)\right] \qquad (3.64\text{b})
\end{aligned}$$

Figure 3.31 Multipath transmission.


Both the magnitude and the phase characteristics of $H(f)$ are periodic in $f$ with a period of
$1/\Delta t$ (Fig. 3.31b). The multipath channel, therefore, can exhibit nonidealities in the magnitude
and the phase characteristics of the channel and can cause linear distortion (pulse dispersion),
as discussed earlier.

If, for instance, the gains of the two paths are very close, that is, $\alpha \approx 1$, then the signals
received from the two paths may have opposite phase ($\pi$ radians apart) at certain
frequencies. This means that at those frequencies where the two paths happen to result in opposite
phases, the signals from the two paths will almost cancel each other. Equation (3.64b) shows
that at frequencies where $f = n/(2\Delta t)$ ($n$ odd), $\cos 2\pi f\Delta t = -1$, and $|H(f)| \approx 0$ when
$\alpha \approx 1$. These frequencies are the multipath null frequencies. At frequencies $f = n/(2\Delta t)$
($n$ even), the two signals interfere constructively to enhance the gain. Such channels cause
frequency-selective fading of transmitted signals. Such distortion can be partly corrected by
using the tapped delay-line equalizer, as shown in Prob. 3.6-2. These equalizers are useful in
several applications in communications. Their design issues are addressed later in Chapters 7
and 12.
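The nulls and peaks of the two-path channel can be tabulated directly from Eq. (3.64). The sketch below uses assumed values ($\alpha = 0.95$, $t_d = 1$ ms, $\Delta t = 1\ \mu$s) and evaluates $|H(f)|$ at $f = n/(2\Delta t)$.

```python
import cmath
import math

def two_path_response(f_hz, alpha, t_d, delta_t):
    """Eq. (3.64a): H(f) = exp(-j*2*pi*f*t_d) * (1 + alpha*exp(-j*2*pi*f*delta_t))."""
    return (cmath.exp(-2j * math.pi * f_hz * t_d)
            * (1 + alpha * cmath.exp(-2j * math.pi * f_hz * delta_t)))

def magnitude_from_3_64b(f_hz, alpha, delta_t):
    """|H(f)| = sqrt(1 + alpha^2 + 2*alpha*cos(2*pi*f*delta_t)), from Eq. (3.64b)."""
    return math.sqrt(1 + alpha ** 2
                     + 2 * alpha * math.cos(2 * math.pi * f_hz * delta_t))

ALPHA, T_D, DELTA_T = 0.95, 1e-3, 1e-6      # hypothetical channel parameters
for n in range(5):
    f = n / (2 * DELTA_T)                   # n odd: null, n even: peak
    h = two_path_response(f, ALPHA, T_D, DELTA_T)
    print(f"n = {n}  f = {f / 1e6:.1f} MHz  |H(f)| = {abs(h):.3f}")
```

For odd $n$ the magnitude is $1 - \alpha = 0.05$ (a deep null), and for even $n$ it is $1 + \alpha = 1.95$, which is the frequency-selective behavior described above.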





3.6.4 Fading Channels 

Thus far, the channel characteristics have been assumed to be constant with time. In
practice, we encounter channels whose transmission characteristics vary with time. These include
troposcatter channels and channels using the ionosphere for radio reflection to achieve
long-distance communication. The time variations of the channel properties arise because of
semiperiodic and random changes in the propagation characteristics of the medium. The reflection
properties of the ionosphere, for example, are related to meteorological conditions that change
seasonally, daily, and even from hour to hour, much like the weather. Periods of sudden storms
also occur. Hence, the effective channel transfer function varies semiperiodically and
randomly, causing random attenuation of the signal. This phenomenon is known as fading. One
way to reduce the effects of slow fading is to use automatic gain control (AGC).*

Fading may be strongly frequency dependent where different frequency components 
are affected unequally. Such fading, known as frequency-selective fading, can cause serious 
problems in communication. Multipath propagation can cause frequency-selective fading. 


3.7 SIGNAL ENERGY AND ENERGY 
SPECTRAL DENSITY 

The energy $E_g$ of a signal $g(t)$ is defined as the area under $|g(t)|^2$. We can also determine the
signal energy from its Fourier transform $G(f)$ through Parseval's theorem.


3.7.1 Parseval's Theorem 


Signal energy can be related to the signal spectrum $G(f)$ by substituting Eq. (3.9b) in Eq. (2.2):

$$E_g = \int_{-\infty}^{\infty} g(t)g^*(t)\,dt = \int_{-\infty}^{\infty} g(t)\left[\int_{-\infty}^{\infty} G^*(f)e^{-j2\pi ft}\,df\right]dt$$

Here, we used the fact that $g^*(t)$, being the conjugate of $g(t)$, can be expressed as the conjugate
of the right-hand side of Eq. (3.9b). Now, interchanging the order of integration yields

$$\begin{aligned}
E_g &= \int_{-\infty}^{\infty} G^*(f)\left[\int_{-\infty}^{\infty} g(t)e^{-j2\pi ft}\,dt\right]df \\
    &= \int_{-\infty}^{\infty} G(f)G^*(f)\,df \\
    &= \int_{-\infty}^{\infty} |G(f)|^2\,df
\end{aligned} \tag{3.65}$$

This is the well-known statement of Parseval's theorem. A similar result was obtained for a
periodic signal and its Fourier series in Eq. (2.68). This result allows us to determine the signal
energy from either the time domain specification $g(t)$ or the frequency domain specification
$G(f)$ of the same signal.


* AGC will also suppress slow variations of the original signal.

Example 3.16 Verify Parseval's theorem for the signal $g(t) = e^{-at}u(t)$ ($a > 0$).

We have

$$E_g = \int_{-\infty}^{\infty} g^2(t)\,dt = \int_0^{\infty} e^{-2at}\,dt = \frac{1}{2a}$$

We now determine $E_g$ from the signal spectrum $G(f)$ given by

$$G(f) = \frac{1}{j2\pi f + a}$$

and from Eq. (3.65),

$$E_g = \int_{-\infty}^{\infty} |G(f)|^2\,df = \int_{-\infty}^{\infty} \frac{1}{(2\pi f)^2 + a^2}\,df
     = \frac{1}{2\pi a}\tan^{-1}\frac{2\pi f}{a}\bigg|_{-\infty}^{\infty} = \frac{1}{2a} \tag{3.66}$$

which verifies Parseval's theorem.
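The result of Example 3.16 can also be confirmed by direct numerical integration in both domains. The sketch below uses the midpoint rule with $a = 1$. The integration limits and step counts are our assumptions, and the frequency-domain integral is truncated at $\pm 1000$ Hz, so it falls slightly short of $1/2a$.

```python
import math

def time_domain_energy(a, t_max=20.0, steps=100000):
    """Midpoint estimate of the integral of g^2(t) = exp(-2*a*t) over [0, t_max]."""
    dt = t_max / steps
    return sum(math.exp(-2 * a * (i + 0.5) * dt) for i in range(steps)) * dt

def frequency_domain_energy(a, f_max=1000.0, steps=200000):
    """Midpoint estimate of the integral of |G(f)|^2 = 1/((2*pi*f)^2 + a^2) over [-f_max, f_max]."""
    df = 2 * f_max / steps
    total = 0.0
    for i in range(steps):
        f = -f_max + (i + 0.5) * df
        total += df / ((2 * math.pi * f) ** 2 + a * a)
    return total

a = 1.0
print(f"time domain:      {time_domain_energy(a):.5f}")
print(f"frequency domain: {frequency_domain_energy(a):.5f}")
print(f"1/(2a):           {1 / (2 * a):.5f}")
```

Both estimates agree with $1/2a$ to within the truncation error of the frequency-domain integral, whose tail decays only as $1/f^2$.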


3.7.2 Energy Spectral Density (ESD) 

Equation (3.65) can be interpreted to mean that the energy of a signal $g(t)$ is the result of
energies contributed by all the spectral components of the signal $g(t)$. The contribution of a
spectral component of frequency $f$ is proportional to $|G(f)|^2$. To elaborate this further, consider
a signal $g(t)$ applied at the input of an ideal bandpass filter, whose transfer function $H(f)$ is
shown in Fig. 3.32a. This filter suppresses all frequencies except a narrow band $\Delta f$ ($\Delta f \to 0$)
centered at frequency $f_0$ (Fig. 3.32b). If the filter output is $y(t)$, then its Fourier
transform $Y(f) = G(f)H(f)$, and $E_y$, the energy of the output $y(t)$, is

$$E_y = \int_{-\infty}^{\infty} |G(f)H(f)|^2\,df \tag{3.67}$$

Because $H(f) = 1$ over the passband $\Delta f$, and zero everywhere else, the integral on the
right-hand side is the sum of the two shaded areas in Fig. 3.32b, and we have (for $\Delta f \to 0$)

$$E_y = 2|G(f_0)|^2\,\Delta f$$

Thus, $2|G(f_0)|^2\,\Delta f$ is the energy contributed by the spectral components within the two narrow
bands, each of width $\Delta f$ Hz, centered at $\pm f_0$. Therefore, we can interpret $|G(f)|^2$ as the energy
per unit bandwidth (in hertz) of the spectral components of $g(t)$ centered at frequency $f$.
In other words, $|G(f)|^2$ is the energy spectral density (per unit bandwidth in hertz) of $g(t)$.
Actually, since both the positive- and the negative-frequency components combine to form the
components in the band $\Delta f$, the energy contributed per unit bandwidth is $2|G(f)|^2$. However,
for the sake of convenience we consider the positive- and negative-frequency components to
be independent. The energy spectral density (ESD) $\Psi_g(f)$ is thus defined as

$$\Psi_g(f) = |G(f)|^2 \tag{3.68}$$

and Eq. (3.65) can be expressed as

$$E_g = \int_{-\infty}^{\infty} \Psi_g(f)\,df \tag{3.69a}$$

From the results in Example 3.16, the ESD of the signal $g(t) = e^{-at}u(t)$ is

$$\Psi_g(f) = |G(f)|^2 = \frac{1}{(2\pi f)^2 + a^2} \tag{3.69b}$$

Figure 3.32


3.7.3 Essential Bandwidth of a Signal 

The spectra of most signals extend to infinity. However, because the energy of a practical signal
is finite, the signal spectrum must approach 0 as $f \to \infty$. Most of the signal energy is contained
within a certain band of $B$ Hz, and the energy content of the components of frequencies greater
than $B$ Hz is negligible. We can therefore suppress the signal spectrum beyond $B$ Hz with little
effect on the signal shape and energy. The bandwidth $B$ is called the essential bandwidth of the
signal. The criterion for selecting $B$ depends on the error tolerance in a particular application.
We may, for instance, select $B$ to be that bandwidth that contains 95% of the signal energy.* The
energy level may be higher or lower than 95%, depending on the precision needed. We can use
such a criterion to determine the essential bandwidth of a signal. Suppression of all the spectral
components of $g(t)$ beyond the essential bandwidth results in a signal $\hat{g}(t)$, which is a close
approximation of $g(t)$.† If we use the 95% criterion for the essential bandwidth, the energy of
the error (the difference) $g(t) - \hat{g}(t)$ is 5% of $E_g$. The following example demonstrates the
bandwidth estimation procedure.


* Essential bandwidth for a low-pass signal may also be defined as a frequency at which the value of the amplitude
spectrum is a small fraction (about 5-10%) of its peak value. In Example 3.16, the peak of $|G(f)|$ is $1/a$, and it
occurs at $f = 0$.

† In practice the truncation is performed gradually, by using tapered windows, to avoid excessive spectral leakage
due to the abrupt truncation.⁵

Example 3.17 Estimate the essential bandwidth $W$ (in rad/s) of the signal $e^{-at}u(t)$ if the essential band is
required to contain 95% of the signal energy.

Figure 3.33 Estimating the essential bandwidth of a signal.

In this case,

$$G(f) = \frac{1}{j2\pi f + a}$$

and the ESD is

$$|G(f)|^2 = \frac{1}{(2\pi f)^2 + a^2}$$

This ESD is shown in Fig. 3.33. Moreover, the signal energy $E_g$ is the area under this
ESD, which has already been found to be $1/2a$. Let $W$ rad/s be the essential bandwidth,
which contains 95% of the total signal energy $E_g$. This means $1/2\pi$ times the shaded area
in Fig. 3.33 is $0.95/2a$, that is,

$$\frac{0.95}{2a} = \int_{-W/2\pi}^{W/2\pi} \frac{df}{(2\pi f)^2 + a^2}
 = \frac{1}{2\pi a}\tan^{-1}\frac{2\pi f}{a}\bigg|_{-W/2\pi}^{W/2\pi}
 = \frac{1}{\pi a}\tan^{-1}\frac{W}{a}$$

or

$$\frac{0.95\pi}{2} = \tan^{-1}\frac{W}{a} \;\Longrightarrow\; W = 12.7a \text{ rad/s}$$

In terms of hertz, the essential bandwidth is

$$B = \frac{W}{2\pi} = 2.02a \text{ Hz}$$

This means that in the band from 0 (dc) to $12.7 \times a$ rad/s ($2.02 \times a$ Hz), the spectral
components of $g(t)$ contribute 95% of the total signal energy; all the remaining spectral
components (in the band from $2.02 \times a$ Hz to $\infty$) contribute only 5% of the signal energy.*
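The numbers in Example 3.17 follow from inverting $(1/\pi a)\tan^{-1}(W/a) = 0.95/2a$. The short sketch below evaluates this closed form. The function names are hypothetical, and $a = 1$ is assumed so that the results read directly as multiples of $a$.

```python
import math

def essential_bandwidth_rad(a, fraction):
    """W (rad/s) such that (1/(pi*a)) * atan(W/a) = fraction/(2*a) for g(t) = exp(-a*t) u(t)."""
    return a * math.tan(fraction * math.pi / 2)

def energy_fraction_within(a, w_rad):
    """Fraction of E_g = 1/(2a) carried by the components with |2*pi*f| <= W."""
    return (2 / math.pi) * math.atan(w_rad / a)

a = 1.0
W = essential_bandwidth_rad(a, 0.95)
print(f"W = {W:.3f} a rad/s")
print(f"B = {W / (2 * math.pi):.3f} a Hz")
print(f"energy fraction inside W: {energy_fraction_within(a, W):.4f}")
```

Changing the fraction shows how sharply the required bandwidth grows with the energy criterion: a 99% criterion, for example, needs a band about five times wider than the 95% one.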


* Note that although the ESD exists over the band $-\infty$ to $\infty$, the trigonometric spectrum exists only over the band 0
to $\infty$. The spectrum range $-\infty$ to $\infty$ applies to the exponential spectrum. In practice, whenever we talk about a
bandwidth, we mean it in the trigonometric sense. Hence, the essential band is from 0 to $B$ Hz (or $W$ rad/s), not from
$-B$ to $B$.

Example 3.18 Estimate the essential bandwidth of a rectangular pulse $g(t) = \Pi(t/T)$ (Fig. 3.34a), where
the essential bandwidth is to contain at least 90% of the pulse energy.

Figure 3.34 (a) Rectangular function, (b) its energy spectral density, and (c) fraction of energy inside $B$ (Hz).

For this pulse, the energy $E_g$ is

$$E_g = \int_{-\infty}^{\infty} g^2(t)\,dt = \int_{-T/2}^{T/2} dt = T$$

Also because

$$\Pi\!\left(\frac{t}{T}\right) \Longleftrightarrow T\,\mathrm{sinc}\,(\pi f T)$$

the ESD for this pulse is

$$\Psi_g(f) = |G(f)|^2 = T^2\,\mathrm{sinc}^2(\pi f T)$$

This ESD is shown in Fig. 3.34b as a function of $\omega T$ as well as $fT$, where $f$ is the frequency
in hertz. The energy $E_B$ within the band from 0 to $B$ Hz is given by

$$E_B = \int_{-B}^{B} T^2\,\mathrm{sinc}^2(\pi f T)\,df$$

Setting $2\pi f T = x$ in this integral so that $df = dx/(2\pi T)$, we obtain

$$E_B = \frac{T}{\pi}\int_0^{2\pi BT} \mathrm{sinc}^2\!\left(\frac{x}{2}\right) dx$$

Also because $E_g = T$, we have

$$\frac{E_B}{E_g} = \frac{1}{\pi}\int_0^{2\pi BT} \mathrm{sinc}^2\!\left(\frac{x}{2}\right) dx$$

The integral on the right-hand side is numerically computed, and the plot of $E_B/E_g$ vs.
$BT$ is shown in Fig. 3.34c. Note that 90.28% of the total energy of the pulse $g(t)$ is
contained within the band $B = 1/T$ Hz. Therefore, by the 90% criterion, the bandwidth
of a rectangular pulse of width $T$ seconds is $1/T$ Hz.
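The numerical integration mentioned in the example can be sketched in a few lines. This Python sketch assumes the chapter's convention $\mathrm{sinc}(x) = \sin x / x$ and uses Simpson's rule; the function names and the number of steps are illustrative choices, not taken from the text.

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x, the convention used in this chapter."""
    return 1.0 if x == 0 else math.sin(x) / x

def energy_fraction(BT, steps=2000):
    """E_B/E_g = (1/pi) * integral from 0 to 2*pi*BT of sinc^2(x/2) dx (Simpson's rule)."""
    upper = 2 * math.pi * BT
    h = upper / steps                        # steps must be even for Simpson's rule
    total = sinc(0.0) ** 2 + sinc(upper / 2) ** 2
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * sinc(i * h / 2) ** 2
    return (h / 3) * total / math.pi

print(f"{energy_fraction(1.0):.4f}")         # fraction of energy inside B = 1/T
```

Evaluating `energy_fraction` over a range of `BT` values reproduces the kind of curve described for Fig. 3.34c.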


3.7.4 Energy of Modulated Signals

We have seen that modulation shifts the signal spectrum $G(f)$ to the left and right by $f_0$. We
now show that a similar thing happens to the ESD of the modulated signal.

Let $g(t)$ be a baseband signal band-limited to $B$ Hz. The amplitude-modulated signal
$\varphi(t)$ is

$$\varphi(t) = g(t)\cos 2\pi f_0 t$$

and the spectrum (Fourier transform) of $\varphi(t)$ is

$$\Phi(f) = \frac{1}{2}\left[G(f + f_0) + G(f - f_0)\right]$$

The ESD of the modulated signal $\varphi(t)$ is $|\Phi(f)|^2$, that is,

$$\Psi_\varphi(f) = \frac{1}{4}\left|G(f + f_0) + G(f - f_0)\right|^2$$

If $f_0 \ge B$, then $G(f + f_0)$ and $G(f - f_0)$ are nonoverlapping (see Fig. 3.35), and

$$\Psi_\varphi(f) = \frac{1}{4}\left[\,|G(f + f_0)|^2 + |G(f - f_0)|^2\,\right] \tag{3.70}$$

Figure 3.35 Energy spectral densities of modulating and modulated signals.

The ESDs of both $g(t)$ and the modulated signal $\varphi(t)$ are shown in Fig. 3.35. It is clear that
modulation shifts the ESD of $g(t)$ by $\pm f_0$. Observe that the area under $\Psi_\varphi(f)$ is half the area
under $\Psi_g(f)$. Because the energy of a signal is proportional to the area under its ESD, it follows
that the energy of $\varphi(t)$ is half the energy of $g(t)$, that is,

$$E_\varphi = \frac{1}{2}E_g \qquad f_0 \ge B \tag{3.71}$$

It may seem surprising that a signal $\varphi(t)$, which appears so energetic in comparison to $g(t)$,
should have only half the energy of $g(t)$. Appearances are deceiving, as usual. The energy of
a signal is proportional to the square of its amplitude, and higher amplitudes contribute more
energy. Signal $g(t)$ remains at higher amplitude levels most of the time. On the other hand,
$\varphi(t)$, because of the factor $\cos 2\pi f_0 t$, dips to zero amplitude levels many times, which reduces
its energy.
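Equation (3.71) can be checked numerically. The Python sketch below uses a Gaussian pulse as the baseband signal and a 5 Hz carrier; both are assumptions made for illustration. A Gaussian is not strictly band-limited, but its spectrum is negligible well below 5 Hz, so the condition $f_0 \ge B$ holds for practical purposes.

```python
import math

def energy(samples, dt):
    """Riemann-sum approximation of the integral of x^2(t) dt."""
    return sum(x * x for x in samples) * dt

dt = 1e-3
t = [k * dt for k in range(-6000, 6001)]            # -6 s ... 6 s
f0 = 5.0                                            # carrier frequency in Hz (assumed)
g = [math.exp(-tk * tk) for tk in t]                # Gaussian baseband pulse (assumed)
phi = [gk * math.cos(2 * math.pi * f0 * tk) for gk, tk in zip(g, t)]

ratio = energy(phi, dt) / energy(g, dt)
print(f"E_phi / E_g = {ratio:.6f}")                 # expected to be very close to 1/2
```

Lowering `f0` toward zero makes the two shifted spectra overlap, and the ratio then departs from one-half, which illustrates why the condition $f_0 \ge B$ is needed.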

3.7.5 Time Autocorrelation Function and the Energy Spectral Density

In Chapter 2, we showed that a good measure of comparing two signals $g(t)$ and $z(t)$ is the
cross-correlation function $\psi_{gz}(\tau)$ defined in Eq. (2.46). We also defined the correlation of a
signal $g(t)$ with itself [the autocorrelation function $\psi_g(\tau)$] in Eq. (2.47). For a real signal $g(t)$,
the autocorrelation function $\psi_g(\tau)$ is given by*

$$\psi_g(\tau) = \int_{-\infty}^{\infty} g(t)\,g(t + \tau)\,dt \tag{3.72a}$$

* For a complex signal $g(t)$, we define

$$\psi_g(\tau) = \int_{-\infty}^{\infty} g(t)\,g^*(t - \tau)\,dt = \int_{-\infty}^{\infty} g^*(t)\,g(t + \tau)\,dt$$

Setting $x = t + \tau$ in Eq. (3.72a) yields

$$\psi_g(\tau) = \int_{-\infty}^{\infty} g(x)\,g(x - \tau)\,dx$$

In this equation, $x$ is a dummy variable and could be replaced by $t$. Thus,

$$\psi_g(\tau) = \int_{-\infty}^{\infty} g(t)\,g(t \pm \tau)\,dt \tag{3.72b}$$

This shows that for a real $g(t)$, the autocorrelation function is an even function of $\tau$, that is,

$$\psi_g(\tau) = \psi_g(-\tau) \tag{3.72c}$$

There is, in fact, a very important relationship between the autocorrelation of a signal and
its ESD. Specifically, the autocorrelation function of a signal $g(t)$ and its ESD $\Psi_g(f)$ form a
Fourier transform pair, that is,

$$\psi_g(\tau) \Longleftrightarrow \Psi_g(f) \tag{3.73a}$$

Thus,

$$\Psi_g(f) = \mathcal{F}\{\psi_g(\tau)\} = \int_{-\infty}^{\infty} \psi_g(\tau)\,e^{-j2\pi f\tau}\,d\tau \tag{3.73b}$$

$$\psi_g(\tau) = \mathcal{F}^{-1}\{\Psi_g(f)\} = \int_{-\infty}^{\infty} \Psi_g(f)\,e^{+j2\pi f\tau}\,df \tag{3.73c}$$

Note that the Fourier transform of Eq. (3.73a) is performed with respect to $\tau$ in place of $t$.

We now prove that the ESD $\Psi_g(f) = |G(f)|^2$ is the Fourier transform of the autocorrelation
function $\psi_g(\tau)$. Although the result is proved here for real signals, it is valid for complex signals
also. Note that the autocorrelation function is a function of $\tau$, not $t$. Hence, its Fourier transform
is $\int \psi_g(\tau)\,e^{-j2\pi f\tau}\,d\tau$. Thus,

$$\mathcal{F}[\psi_g(\tau)] = \int_{-\infty}^{\infty} e^{-j2\pi f\tau}\left[\int_{-\infty}^{\infty} g(t)\,g(t + \tau)\,dt\right] d\tau
= \int_{-\infty}^{\infty} g(t)\left[\int_{-\infty}^{\infty} g(\tau + t)\,e^{-j2\pi f\tau}\,d\tau\right] dt$$

The inner integral is the Fourier transform of $g(\tau + t)$, which is $g(\tau)$ left-shifted by $t$. Hence, it
is given by $G(f)\,e^{j2\pi ft}$, in accordance with the time-shifting property in Eq. (3.32a). Therefore,

$$\mathcal{F}[\psi_g(\tau)] = G(f)\int_{-\infty}^{\infty} g(t)\,e^{j2\pi ft}\,dt = G(f)\,G(-f) = |G(f)|^2$$

This completes the proof that

$$\psi_g(\tau) \Longleftrightarrow \Psi_g(f) = |G(f)|^2 \tag{3.74}$$

A careful observation of the operation of correlation shows a close connection to convolution.
Indeed, the autocorrelation function $\psi_g(\tau)$ is the convolution of $g(\tau)$ with $g(-\tau)$
because

$$g(\tau) * g(-\tau) = \int_{-\infty}^{\infty} g(x)\,g[-(\tau - x)]\,dx = \int_{-\infty}^{\infty} g(x)\,g(x - \tau)\,dx = \psi_g(\tau)$$

Application of the time convolution property [Eq. (3.44)] to this equation yields Eq. (3.74).

ESD of the Input and the Output

If $x(t)$ and $y(t)$ are the input and the corresponding output of a linear time-invariant (LTI)
system, then

$$Y(f) = H(f)\,X(f)$$

Therefore,

$$|Y(f)|^2 = |H(f)|^2\,|X(f)|^2$$

This shows that

$$\Psi_y(f) = |H(f)|^2\,\Psi_x(f) \tag{3.75}$$

Thus, the output signal ESD is $|H(f)|^2$ times the input signal ESD.
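The transform pair of Eq. (3.74) has an exact discrete counterpart that can be verified directly. The Python sketch below is an illustration under stated assumptions: the sample values are arbitrary, the sequence is zero-padded so that circular correlation equals linear correlation, and a direct $O(N^2)$ DFT is used for clarity rather than speed.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * q * k / N) for k in range(N))
            for q in range(N)]

g = [1.0, 2.0, -1.0, 0.5]                  # arbitrary real samples (assumed)
N = 2 * len(g)                             # zero-pad: circular = linear correlation
gp = g + [0.0] * (N - len(g))

# psi[m] = sum_k g[k] g[k+m]; negative lags wrap around to the end of the list
psi = [sum(gp[k] * gp[(k + m) % N] for k in range(N)) for m in range(N)]

esd_from_psi = dft(psi)                    # transform of the autocorrelation
esd_direct = [abs(G) ** 2 for G in dft(gp)]  # |G|^2 computed directly
err = max(abs(a - b) for a, b in zip(esd_from_psi, esd_direct))
print(f"max |F[psi] - |G|^2| = {err:.2e}")
```

The two routes agree to rounding error, and the transform of `psi` comes out real and nonnegative, as an ESD must.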


3.8 SIGNAL POWER AND POWER 
SPECTRAL DENSITY 


For a power signal, a meaningful measure of its size is its power [defined in Eq. (2.4)] as the
time average of the signal energy averaged over the infinite time interval. The power $P_g$ of a
real-valued signal $g(t)$ is given by

$$P_g = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g^2(t)\,dt \tag{3.76}$$

The signal power and the related concepts can be readily understood by defining a truncated
signal $g_T(t)$ as

$$g_T(t) = \begin{cases} g(t) & |t| \le T/2 \\ 0 & |t| > T/2 \end{cases}$$

The truncated signal is shown in Fig. 3.36. The integral on the right-hand side of Eq. (3.76)
yields $E_{g_T}$, which is the energy of the truncated signal $g_T(t)$. Thus,

$$P_g = \lim_{T\to\infty}\frac{E_{g_T}}{T} \tag{3.77}$$

This equation describes the relationship between power and energy of nonperiodic signals.
Understanding this relationship will be very helpful in understanding and relating all the
power concepts to the energy concepts. Because the signal power is just the time average of
energy, all the concepts and results of signal energy also apply to signal power if we modify
the concepts properly by taking their time averages.

3.8.1 Power Spectral Density (PSD) 

If the signal $g(t)$ is a power signal, then its power is finite, and the truncated signal $g_T(t)$ is an
energy signal as long as $T$ is finite. If $g_T(t) \Longleftrightarrow G_T(f)$, then from Parseval's theorem,

$$E_{g_T} = \int_{-\infty}^{\infty} g_T^2(t)\,dt = \int_{-\infty}^{\infty} |G_T(f)|^2\,df$$

Figure 3.36 Limiting process in derivation of PSD.

Hence, $P_g$, the power of $g(t)$, is given by

$$P_g = \lim_{T\to\infty}\frac{E_{g_T}}{T} = \lim_{T\to\infty}\frac{1}{T}\left[\int_{-\infty}^{\infty} |G_T(f)|^2\,df\right] \tag{3.78}$$

As $T$ increases, the duration of $g_T(t)$ increases, and its energy $E_{g_T}$ also increases proportionately.
This means that $|G_T(f)|^2$ also increases with $T$, and as $T \to \infty$, $|G_T(f)|^2$ also approaches
$\infty$. However, $|G_T(f)|^2$ must approach $\infty$ at the same rate as $T$ because for a power signal, the
right-hand side of Eq. (3.78) must converge. This convergence permits us to interchange the
order of the limiting process and integration in Eq. (3.78), and we have

$$P_g = \int_{-\infty}^{\infty} \lim_{T\to\infty}\frac{|G_T(f)|^2}{T}\,df \tag{3.79}$$

We define the power spectral density (PSD) $S_g(f)$ as

$$S_g(f) = \lim_{T\to\infty}\frac{|G_T(f)|^2}{T} \tag{3.80}$$

Consequently,*

$$P_g = \int_{-\infty}^{\infty} S_g(f)\,df \tag{3.81a}$$

$$\phantom{P_g} = 2\int_{0}^{\infty} S_g(f)\,df \tag{3.81b}$$

This result is parallel to the result [Eq. (3.69a)] for energy signals. The power is the area under
the PSD. Observe that the PSD is the time average of the ESD of $g_T(t)$ [Eq. (3.80)].

As is the case with ESD, the PSD is also a positive, real, and even function of $f$. If $g(t)$ is
a voltage signal, the units of PSD are volts squared per hertz.
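The limiting relationship $P_g = \lim_{T\to\infty} E_{g_T}/T$ can be observed numerically. The Python sketch below assumes a sinusoid with illustrative amplitude, frequency, and phase (none of these values come from the text) and evaluates the energy of the truncated signal by the midpoint rule.

```python
import math

def truncated_energy(g, T, n=20000):
    """Energy of g_T(t): integral of g^2(t) over [-T/2, T/2] (midpoint rule)."""
    dt = T / n
    return sum(g(-T / 2 + (i + 0.5) * dt) ** 2 for i in range(n)) * dt

A, f0 = 2.0, 1.0                            # amplitude and frequency (assumed)
g = lambda t: A * math.cos(2 * math.pi * f0 * t + 0.3)

for T in (1.3, 13.7, 137.9):
    print(T, truncated_energy(g, T) / T)    # tends to A^2 / 2 = 2.0 as T grows
```

The energy $E_{g_T}$ itself grows without bound, while the ratio $E_{g_T}/T$ settles toward the power $A^2/2$.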


* One should be cautious in using a unilateral expression such as $P_g = 2\int_0^\infty S_g(f)\,df$ when $S_g(f)$ contains an
impulse at the origin (a dc component). The impulse part should not be multiplied by the factor 2.

3.8.2 Time Autocorrelation Function of Power Signals

The (time) autocorrelation function $\mathcal{R}_g(\tau)$ of a real power signal $g(t)$ is defined as*

$$\mathcal{R}_g(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g(t)\,g(t - \tau)\,dt \tag{3.82a}$$

We can use the same argument as that used for energy signals [Eqs. (3.72b) and (3.72c)] to
show that $\mathcal{R}_g(\tau)$ is an even function of $\tau$. This means that for a real $g(t)$,

$$\mathcal{R}_g(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g(t)\,g(t + \tau)\,dt \tag{3.82b}$$

and

$$\mathcal{R}_g(\tau) = \mathcal{R}_g(-\tau) \tag{3.83}$$

For energy signals, the ESD $\Psi_g(f)$ is the Fourier transform of the autocorrelation function
$\psi_g(\tau)$. A similar result applies to power signals. We now show that for a power signal, the
PSD $S_g(f)$ is the Fourier transform of the autocorrelation function $\mathcal{R}_g(\tau)$. From Eq. (3.82b)
and Fig. 3.36,

$$\mathcal{R}_g(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-\infty}^{\infty} g_T(t)\,g_T(t + \tau)\,dt = \lim_{T\to\infty}\frac{\psi_{g_T}(\tau)}{T} \tag{3.84}$$

Recall from the Wiener-Khintchine theorem that $\psi_{g_T}(\tau) \Longleftrightarrow |G_T(f)|^2$. Hence, the Fourier
transform of the preceding equation yields

$$\mathcal{R}_g(\tau) \Longleftrightarrow \lim_{T\to\infty}\frac{|G_T(f)|^2}{T} = S_g(f) \tag{3.85}$$

Although we have proved these results for a real $g(t)$, Eqs. (3.80), (3.81a), (3.81b), and (3.85)
are equally valid for a complex $g(t)$.

The concept and relationships for signal power are parallel to those for signal energy. This
is brought out in Table 3.3.

* For a complex $g(t)$, we define

$$\mathcal{R}_g(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g(t)\,g^*(t - \tau)\,dt = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g^*(t)\,g(t + \tau)\,dt$$

TABLE 3.3

| Energy signals | Power signals |
|---|---|
| $E_g = \int_{-\infty}^{\infty} g^2(t)\,dt$ | $P_g = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g^2(t)\,dt = \lim_{T\to\infty}\frac{E_{g_T}}{T}$ |
| $\psi_g(\tau) = \int_{-\infty}^{\infty} g(t)\,g(t+\tau)\,dt$ | $\mathcal{R}_g(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g(t)\,g(t+\tau)\,dt = \lim_{T\to\infty}\frac{\psi_{g_T}(\tau)}{T}$ |
| $\Psi_g(f) = \lvert G(f)\rvert^2$ | $S_g(f) = \lim_{T\to\infty}\frac{\lvert G_T(f)\rvert^2}{T} = \lim_{T\to\infty}\frac{\Psi_{g_T}(f)}{T}$ |
| $\psi_g(\tau) \Longleftrightarrow \Psi_g(f)$ | $\mathcal{R}_g(\tau) \Longleftrightarrow S_g(f)$ |
| $E_g = \int_{-\infty}^{\infty} \Psi_g(f)\,df$ | $P_g = \int_{-\infty}^{\infty} S_g(f)\,df$ |

Signal Power Is Its Mean Square Value

A glance at Eq. (3.76) shows that the signal power is the time average or mean of its squared
value. In other words, $P_g$ is the mean square value of $g(t)$. We must remember, however, that
this is a time mean, not a statistical mean (to be discussed in later chapters). Statistical means
are denoted by overbars. Thus, the (statistical) mean square of a variable $x$ is denoted by $\overline{x^2}$.
To distinguish from this kind of mean, we shall use a wavy overbar to denote a time average.
Thus, the time mean square value of $g(t)$ will be denoted by $\widetilde{g^2(t)}$. The time averages are
conventionally denoted by angle brackets, written as $\langle g^2(t)\rangle$. We shall, however, use the wavy
overbar notation because it is much easier to associate means with a bar on top than with
brackets. Using this notation, we see that

$$P_g = \widetilde{g^2(t)} = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g^2(t)\,dt \tag{3.86a}$$

Note that the rms value of a signal is the square root of its mean square value. Therefore,

$$[g(t)]_{\text{rms}} = \sqrt{P_g} \tag{3.86b}$$

From Eqs. (3.82), it is clear that for a real signal $g(t)$, the time autocorrelation function
$\mathcal{R}_g(\tau)$ is the time mean of $g(t)\,g(t \pm \tau)$. Thus,

$$\mathcal{R}_g(\tau) = \widetilde{g(t)\,g(t \pm \tau)} \tag{3.87}$$

This discussion also explains why we have been using “time autocorrelation” rather than just 
“autocorrelation”. This is to distinguish clearly the present autocorrelation function (a time 
average) from the statistical autocorrelation function (a statistical average) to be introduced in 
Chapter 9 in the context of probability theory and random processes. 

Interpretation of Power Spectral Density 

Because the PSD is the time average of the ESD of $g(t)$, we can argue along the lines used in
the interpretation of ESD. We can readily show that the PSD $S_g(f)$ represents the power per
unit bandwidth (in hertz) of the spectral components at the frequency $f$. The amount of power
contributed by the spectral components within the band $f_1$ to $f_2$ is given by

$$\Delta P_g = 2\int_{f_1}^{f_2} S_g(f)\,df \tag{3.88}$$

Autocorrelation Method: A Powerful Tool 

For a signal $g(t)$, the ESD, which is equal to $|G(f)|^2$, can also be found by taking the Fourier
transform of its autocorrelation function. If the Fourier transform of a signal is enough to determine
its ESD, then why do we needlessly complicate our lives by talking about autocorrelation
functions? The reason for following this alternate route is to lay a foundation for dealing with
power signals and random signals. The Fourier transform of a power signal generally does not
exist. Moreover, the luxury of finding the Fourier transform is available only for deterministic
signals, which can be described as functions of time. The random message signals that occur
in communication problems (e.g., random binary pulse train) cannot be described as functions
of time, and it is impossible to find their Fourier transforms. However, the autocorrelation
function for such signals can be determined from their statistical information. This allows us
to determine the PSD (the spectral information) of such a signal. Indeed, we may consider the
autocorrelation approach to be the generalization of Fourier techniques to power signals and
random signals. The following example of a random binary pulse train dramatically illustrates
the power of this technique.


Example 3.19 Figure 3.37a shows a random binary pulse train $g(t)$. The pulse width is $T_b/2$, and one binary
digit is transmitted every $T_b$ seconds. A binary 1 is transmitted by the positive pulse, and a binary
0 is transmitted by the negative pulse. The two symbols are equally likely and occur randomly.
We shall determine the autocorrelation function, the PSD, and the essential bandwidth of this
signal.

We cannot describe this signal as a function of time because the precise waveform, being
random, is not known. We do, however, know its behavior in terms of the averages (the
statistical information). The autocorrelation function, being an average parameter (time
average) of the signal, is determinable from the given statistical (average) information.
We have [Eq. (3.82a)]

$$\mathcal{R}_g(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} g(t)\,g(t - \tau)\,dt$$

Figure 3.37b shows $g(t)$ by solid lines and $g(t - \tau)$, which is $g(t)$ delayed by $\tau$, by dashed
lines. To determine the integral on the right-hand side of the preceding equation, we
multiply $g(t)$ with $g(t - \tau)$, find the area under the product $g(t)\,g(t - \tau)$, and divide it
by the averaging interval $T$. Let there be $N$ bits (pulses) during this interval $T$ so that
$T = NT_b$, and as $T \to \infty$, $N \to \infty$. Thus,

$$\mathcal{R}_g(\tau) = \lim_{N\to\infty}\frac{1}{NT_b}\int_{-NT_b/2}^{NT_b/2} g(t)\,g(t - \tau)\,dt$$

Let us first consider the case of $\tau < T_b/2$. In this case there is an overlap (shaded region)
between each pulse of $g(t)$ and of $g(t - \tau)$. The area under the product $g(t)\,g(t - \tau)$ is
$T_b/2 - \tau$ for each pulse. Since there are $N$ pulses during the averaging interval, the total
area under $g(t)\,g(t - \tau)$ is $N(T_b/2 - \tau)$, and

$$\mathcal{R}_g(\tau) = \lim_{N\to\infty}\frac{1}{NT_b}\left[N\left(\frac{T_b}{2} - \tau\right)\right] = \frac{1}{2}\left(1 - \frac{2\tau}{T_b}\right) \qquad 0 < \tau < \frac{T_b}{2}$$

Figure 3.37 Autocorrelation function and power spectral density function of a random binary pulse train.

Because $\mathcal{R}_g(\tau)$ is an even function of $\tau$,

$$\mathcal{R}_g(\tau) = \frac{1}{2}\left(1 - \frac{2|\tau|}{T_b}\right) \qquad |\tau| < \frac{T_b}{2} \tag{3.89a}$$

as shown in Fig. 3.37c.

As we increase $\tau$ beyond $T_b/2$, there will be overlap between each pulse and its immediate
neighbor. The two overlapping pulses are equally likely to be of the same polarity or of
opposite polarity. Their product is equally likely to be 1 or $-1$ over the overlapping interval.
On the average, half the pulse products will be 1 (positive-positive or negative-negative
pulse combinations), and the remaining half pulse products will be $-1$ (positive-negative
or negative-positive combinations). Consequently, the area under $g(t)\,g(t - \tau)$ will be zero
when averaged over an infinitely large time ($T \to \infty$), and

$$\mathcal{R}_g(\tau) = 0 \qquad |\tau| > \frac{T_b}{2} \tag{3.89b}$$

The two parts of Eq. (3.89) show that the autocorrelation function in this case is the
triangular function $\frac{1}{2}\Delta(\tau/T_b)$ shown in Fig. 3.37c. The PSD is the Fourier transform of
$\frac{1}{2}\Delta(\tau/T_b)$, which is found in Example 3.13 (or Table 3.1, pair 19) as

$$S_g(f) = \frac{T_b}{4}\,\mathrm{sinc}^2\!\left(\frac{\pi f T_b}{2}\right) \tag{3.90}$$

The PSD is the square of the sinc function, as shown in Fig. 3.37d. From the result in
Example 3.18, we conclude that 90.28% of the area of this spectrum is contained within
the band from 0 to $4\pi/T_b$ rad/s, or from 0 to $2/T_b$ Hz. Thus, the essential bandwidth may be
taken as $2/T_b$ Hz (assuming a 90% power criterion). This example illustrates dramatically
how the autocorrelation function can be used to obtain the spectral information of a
(random) signal when conventional means of obtaining the Fourier spectrum are not
usable.
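The triangular autocorrelation of Eq. (3.89) can be reproduced by simulation. The Python sketch below is an illustration only: it assumes unit-amplitude pulses, eight samples per bit interval, 2000 bits, and a fixed random seed, none of which come from the example itself.

```python
import random

random.seed(1)                        # fixed seed so the run is repeatable
N, S = 2000, 8                        # number of bits, samples per bit interval T_b
g = []
for _ in range(N):
    a = random.choice((1.0, -1.0))    # binary 1 -> +1 pulse, binary 0 -> -1 pulse
    g += [a] * (S // 2) + [0.0] * (S // 2)   # half-width pulse, then zero

def time_autocorr(g, lag):
    """Time average of g(t) g(t - tau) for tau = lag samples."""
    return sum(g[k] * g[k - lag] for k in range(lag, len(g))) / len(g)

for lag in range(0, S + 1, 2):
    tau = lag / S                                    # tau in units of T_b
    theory = max(0.0, 0.5 * (1 - 2 * tau))           # Eq. (3.89)
    print(f"tau = {tau:.2f} T_b  estimate = {time_autocorr(g, lag):+.3f}  theory = {theory:.3f}")
```

For lags below $T_b/2$ the estimate matches the triangle exactly, because each pulse overlaps only itself. For larger lags the estimate is a small random number that shrinks as the number of bits grows.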


3.8.3 Input and Output Power Spectral Densities 

Because the PSD is a time average of ESDs, the relationship between the input and output
signal PSDs of a linear time-invariant (LTI) system is similar to that of ESDs. Following the
argument used for ESD [Eq. (3.75)], we can readily show that if $g(t)$ and $y(t)$ are the input
and output signals of an LTI system with transfer function $H(f)$, then

$$S_y(f) = |H(f)|^2\,S_g(f) \tag{3.91}$$

Example 3.20 A noise signal $n_i(t)$ with PSD $S_{n_i}(f) = K$ is applied at the input of an ideal differentiator
(Fig. 3.38a). Determine the PSD and the power of the output noise signal $n_o(t)$.

Figure 3.38 Power spectral densities at the input and the output of an ideal differentiator.

The transfer function of an ideal differentiator is $H(f) = j2\pi f$. If the noise at the
differentiator output is $n_o(t)$, then from Eq. (3.91),

$$S_{n_o}(f) = |H(f)|^2\,S_{n_i}(f) = |j2\pi f|^2 K$$

The output PSD $S_{n_o}(f)$ is parabolic, as shown in Fig. 3.38c. The output noise power $N_o$ is
the area under the output PSD. Therefore,

$$N_o = \int_{-B}^{B} K(2\pi f)^2\,df = 2K\int_{0}^{B} (2\pi f)^2\,df = \frac{8\pi^2 B^3 K}{3}$$
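The closed-form output power can be cross-checked by integrating the parabolic PSD numerically. In this Python sketch the values of $K$ and $B$ are illustrative and are not taken from the example.

```python
import math

def output_noise_power(K, B, n=10000):
    """Area under S_no(f) = K (2 pi f)^2 over -B <= f <= B (midpoint rule)."""
    df = 2 * B / n
    return sum(K * (2 * math.pi * (-B + (i + 0.5) * df)) ** 2 for i in range(n)) * df

K, B = 1e-3, 4000.0                        # illustrative values (assumed)
closed_form = 8 * math.pi ** 2 * B ** 3 * K / 3
print(output_noise_power(K, B), closed_form)
```

Because the output power grows as $B^3$, doubling the bandwidth raises the differentiator's output noise power by a factor of eight.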


3.8.4 PSD of Modulated Signals 

Following the argument in deriving Eqs. (3.70) and (3.71) for energy signals, we can derive
similar results for power signals by taking the time averages. We can show that for a power
signal $g(t)$, if

$$\varphi(t) = g(t)\cos 2\pi f_0 t$$

then the PSD $S_\varphi(f)$ of the modulated signal $\varphi(t)$ is given by

$$S_\varphi(f) = \frac{1}{4}\left[S_g(f + f_0) + S_g(f - f_0)\right] \tag{3.92}$$

The detailed derivation is provided in Sec. 7.8. Thus, modulation shifts the PSD of $g(t)$ by
$\pm f_0$. The power of $\varphi(t)$ is half the power of $g(t)$, that is,

$$P_\varphi = \frac{1}{2}P_g \qquad f_0 \ge B \tag{3.93}$$

3.9 NUMERICAL COMPUTATION OF FOURIER 
TRANSFORM: THE DFT 

To compute $G(f)$, the Fourier transform of $g(t)$, numerically, we have to use the samples of
$g(t)$. Moreover, we can determine $G(f)$ only at some finite number of frequencies. Thus, we
can compute only samples of $G(f)$. For this reason, we shall now find the relationships between
samples of $g(t)$ and samples of $G(f)$.

In numerical computations, the data must be finite. This means that the number of samples
of $g(t)$ and $G(f)$ must be finite. In other words, we must deal with time-limited signals. If the
signal is not time-limited, then we need to truncate it to make its duration finite. The same is
true of $G(f)$. To begin, let us consider a signal $g(t)$ of duration $\tau$ seconds, starting at $t = 0$,
as shown in Fig. 3.39a. However, for reasons that will become clear as we go along, we shall
consider the duration of $g(t)$ to be $T_0$, where $T_0 \ge \tau$, which makes $g(t) = 0$ in the interval
$\tau < t \le T_0$, as shown in Fig. 3.39a. Clearly, this makes no difference in the computation of
$G(f)$. Let us take samples of $g(t)$ at uniform intervals of $T_s$ seconds. There are a total of $N_0$
samples, where

$$N_0 = \frac{T_0}{T_s} \tag{3.94}$$

Figure 3.39

Now,*

$$G(f) = \int_0^{T_0} g(t)\,e^{-j2\pi ft}\,dt = \lim_{T_s\to 0}\sum_{k=0}^{N_0-1} g(kT_s)\,e^{-j2\pi fkT_s}\,T_s \tag{3.95}$$

Let us consider the samples of $G(f)$ at uniform intervals of $f_0$. If $G_q$ is the $q$th sample, that is,
$G_q = G(qf_0)$, then from Eq. (3.95), we obtain

$$G_q = \sum_{k=0}^{N_0-1} T_s\,g(kT_s)\,e^{-jq2\pi f_0 T_s k} = \sum_{k=0}^{N_0-1} g_k\,e^{-jq\Omega_0 k} \tag{3.96}$$

where

$$g_k = T_s\,g(kT_s), \qquad G_q = G(qf_0), \qquad \Omega_0 = 2\pi f_0 T_s \tag{3.97}$$

Thus, Eq. (3.96) relates the samples of $g(t)$ to the samples of $G(f)$. In this derivation, we
have assumed that $T_s \to 0$. In practice, it is not possible to make $T_s \to 0$ because this would
increase the data enormously. We strive to make $T_s$ as small as is practicable. This will result
in some computational error.

We make an interesting observation from Eq. (3.96). The samples $G_q$ are periodic with a
period of $2\pi/\Omega_0$ samples. This follows from Eq. (3.96), which shows that $G_{(q + 2\pi/\Omega_0)} = G_q$.
Thus, only $2\pi/\Omega_0$ samples $G_q$ can be independent. Equation (3.96) shows that $G_q$ is determined
by $N_0$ independent values $g_k$. Hence, for unique inverses of these equations, there can be only
$N_0$ independent sample values $G_q$. This means that

$$N_0 = \frac{2\pi}{\Omega_0} = \frac{2\pi}{2\pi f_0 T_s} = \frac{2\pi N_0}{2\pi f_0 T_0} \tag{3.98}$$

In other words, we have

$$2\pi f_0 = \frac{2\pi}{T_0} \qquad \text{and} \qquad f_0 = \frac{1}{T_0} \tag{3.99}$$

Thus, the spectral sampling interval $f_0$ Hz can be adjusted by a proper choice of $T_0$: the larger
the $T_0$, the smaller the $f_0$. The wisdom of selecting $T_0 \ge \tau$ is now clear. When $T_0$ is greater than
$\tau$, we shall have several zero-valued samples $g_k$ in the interval from $\tau$ to $T_0$. Thus, by increasing
the number of zero-valued samples of $g_k$, we reduce $f_0$ [more closely spaced samples of $G(f)$],
yielding more details of $G(f)$. This process of reducing $f_0$ by the inclusion of zero-valued
samples $g_k$ is known as zero padding. Also, for a given sampling interval $T_s$, larger $T_0$ implies
larger $N_0$. Thus, by selecting a suitably large value of $N_0$, we can obtain samples of $G(f)$ as
close as possible.

* The upper limit on the summation in Eq. (3.95) is $N_0 - 1$ (not $N_0$) because the last term in the sum starts at
$(N_0 - 1)T_s$ and covers the area under the summand up to $N_0 T_s = T_0$.

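To find the inverse relationship, we multiply both sides of Eq. (3.96) by $e^{jm\Omega_0 q}$ and sum
over $q$ as

$$\sum_{q=0}^{N_0-1} G_q\,e^{jm\Omega_0 q} = \sum_{q=0}^{N_0-1}\left[\sum_{k=0}^{N_0-1} g_k\,e^{-jq\Omega_0 k}\right] e^{jm\Omega_0 q}$$

Upon interchanging the order of summation on the right-hand side,

$$\sum_{q=0}^{N_0-1} G_q\,e^{jm\Omega_0 q} = \sum_{k=0}^{N_0-1} g_k\left[\sum_{q=0}^{N_0-1} e^{j(m-k)\Omega_0 q}\right] \tag{3.100}$$

To find the inner sum on the right-hand side, we shall now show that

$$\sum_{k=0}^{N_0-1} e^{jn\Omega_0 k} = \begin{cases} N_0 & n = 0, \pm N_0, \pm 2N_0, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{3.101}$$

To show this, recall that $\Omega_0 N_0 = 2\pi$ and $e^{jn\Omega_0 k} = 1$ for $n = 0, \pm N_0, \pm 2N_0, \ldots$, so that

$$\sum_{k=0}^{N_0-1} e^{jn\Omega_0 k} = \sum_{k=0}^{N_0-1} 1 = N_0 \qquad n = 0, \pm N_0, \pm 2N_0, \ldots$$

To compute the sum for other values of $n$, we note that the sum on the left-hand side of
Eq. (3.101) is a geometric series with common ratio $\alpha = e^{jn\Omega_0}$. Therefore, its partial sum of
the first $N_0$ terms is

$$\sum_{k=0}^{N_0-1} e^{jn\Omega_0 k} = \frac{\alpha^{N_0} - 1}{\alpha - 1} = 0$$

where

$$\alpha^{N_0} = e^{jn\Omega_0 N_0} = e^{j2\pi n} = 1$$

This proves Eq. (3.101).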

It now follows that the inner sum on the right-hand side of Eq. (3.100) is zero for $k \ne m$,
and the sum is $N_0$ when $k = m$. Therefore, the outer sum will have only one nonzero term
when $k = m$, and it is $N_0 g_k = N_0 g_m$. Therefore,

$$g_m = \frac{1}{N_0}\sum_{q=0}^{N_0-1} G_q\,e^{jm\Omega_0 q} \qquad \Omega_0 = \frac{2\pi}{N_0} \tag{3.102}$$

Equation (3.102) reveals the interesting fact that $g_{(k + N_0)} = g_k$. This means that the
sequence $g_k$ is also periodic with a period of $N_0$ samples (representing the time duration
$N_0 T_s = T_0$ seconds). Moreover, $G_q$ is also periodic with a period of $N_0$ samples, representing
a frequency interval $N_0 f_0 = (T_0/T_s)(1/T_0) = 1/T_s = f_s$ hertz. But $1/T_s$ is the number of samples
of $g(t)$ per second. Thus, $1/T_s = f_s$ is the sampling frequency (in hertz) of $g(t)$. This means
that $G_q$ is $N_0$-periodic, repeating every $f_s$ Hz. Let us summarize the results derived so far. We
have proved the discrete Fourier transform (DFT) pair

$$G_q = \sum_{k=0}^{N_0-1} g_k\,e^{-jq\Omega_0 k} \tag{3.103a}$$

$$g_k = \frac{1}{N_0}\sum_{q=0}^{N_0-1} G_q\,e^{jk\Omega_0 q} \tag{3.103b}$$

where

$$g_k = T_s\,g(kT_s) \qquad G_q = G(qf_0)$$

$$2\pi f_0 = \frac{2\pi}{T_0} \qquad 2\pi f_s = \frac{2\pi}{T_s} \tag{3.104}$$

$$N_0 = \frac{T_0}{T_s} = \frac{f_s}{f_0} \qquad \Omega_0 = 2\pi f_0 T_s = \frac{2\pi}{N_0}$$

Both the sequences $g_k$ and $G_q$ are periodic with a period of $N_0$ samples. This results in $g_k$
repeating with period $T_0$ seconds and $G_q$ repeating with period $2\pi f_s = 2\pi/T_s$ rad/s, or $f_s = 1/T_s$ Hz
(the sampling frequency). The sampling interval of $g_k$ is $T_s$ seconds and the sampling interval
of $G_q$ is $f_0 = 1/T_0$ Hz. This is shown in Fig. 3.39c and d. For convenience, we have used the
frequency variable $f$ (in hertz) rather than $\omega$ (in radians per second).
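The DFT pair of Eq. (3.103) translates almost line for line into code. The Python sketch below evaluates both sums directly, an $O(N_0^2)$ computation chosen for clarity, and checks that the inverse recovers the original samples. The sample values are arbitrary and include trailing zeros to mimic zero padding.

```python
import cmath

def dft(g):
    """Eq. (3.103a): G_q = sum_k g_k exp(-j q Omega0 k), with Omega0 = 2 pi / N0."""
    N0 = len(g)
    omega0 = 2 * cmath.pi / N0
    return [sum(g[k] * cmath.exp(-1j * q * omega0 * k) for k in range(N0))
            for q in range(N0)]

def idft(G):
    """Eq. (3.103b): g_k = (1/N0) sum_q G_q exp(+j k Omega0 q)."""
    N0 = len(G)
    omega0 = 2 * cmath.pi / N0
    return [sum(G[q] * cmath.exp(1j * k * omega0 * q) for q in range(N0)) / N0
            for k in range(N0)]

g = [0.5, 1.0, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]   # samples with zero padding (assumed)
G = dft(g)
g_back = idft(G)
print(max(abs(a - b) for a, b in zip(g, g_back)))   # round-trip error
```

Appending more zeros to `g` leaves $T_s$ unchanged but increases $N_0$ and $T_0$, so the same spectrum is sampled on a finer frequency grid.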

We have assumed $g(t)$ to be time-limited to $\tau$ seconds. This makes $G(f)$ non-band-limited.*
Hence, the periodic repetition of the spectra $G_q$, as shown in Fig. 3.39d, will cause
overlapping of spectral components, resulting in error. The nature of this error, known as
aliasing error, is explained in more detail in Chapter 6. The spectrum $G_q$ repeats every $f_s$
Hz. The aliasing error is reduced by increasing $f_s$, the repetition frequency (see Fig. 3.39d).
To summarize, the computation of $G_q$ using DFT has aliasing error when $g(t)$ is time-limited.
This error can be made as small as desired by increasing the sampling frequency $f_s = 1/T_s$ (or
reducing the sampling interval $T_s$). The aliasing error is the direct result of the nonfulfillment
of the requirement $T_s \to 0$ in Eq. (3.95).

When $g(t)$ is not time-limited, we need to truncate it to make it time-limited. This will
cause further error in $G_q$. This error can be reduced as much as desired by appropriately
increasing the truncating interval $T_0$.†

In computing the inverse Fourier transform [by using the inverse DFT in Eq. (3.103b)],
we have similar problems. If $G(f)$ is band-limited, $g(t)$ is not time-limited, and the periodic
repetition of samples $g_k$ will overlap (aliasing in the time domain). We can reduce the aliasing
error by increasing $T_0$, the period of $g_k$ (in seconds). This is equivalent to reducing the frequency
sampling interval $f_0 = 1/T_0$ of $G(f)$. Moreover, if $G(f)$ is not band-limited, we need to truncate
it. This will cause an additional error in the computation of $g_k$. By increasing the truncation
bandwidth, we can reduce this error. In practice, (tapered) window functions are often used
for truncation (Ref. 5) in order to reduce the severity of some problems caused by straight truncation
(also known as rectangular windowing).

Because $G_q$ is $N_0$-periodic, we need to determine the values of $G_q$ over any one period.
It is customary to determine $G_q$ over the range $(0, N_0 - 1)$ rather than over the range
$(-N_0/2, N_0/2 - 1)$. The identical remark applies to $g_k$.

* We can show that a signal cannot be simultaneously time-limited and band-limited. If it is one, it cannot be the
other, and vice versa (Ref. 3).

† The DFT relationships represent a transform in their own right, and they are exact. If, however, we identify $g_k$ and
$G_q$ as the samples of a signal $g(t)$ and its Fourier transform $G(f)$, respectively, then the DFT relationships are
approximations because of the aliasing and truncating effects.


Choice of $T_s$, $T_0$, and $N_0$

In DFT computation, we first need to select suitable values for $N_0$, $T_s$, and $T_0$. For this purpose
we should first decide on $B$, the essential bandwidth of $g(t)$. From Fig. 3.39d, it is clear that the
spectral overlapping (aliasing) occurs at the frequency $f_s/2$ Hz. This spectral overlapping may
also be viewed as the spectrum beyond $f_s/2$ folding back at $f_s/2$. Hence, this frequency is also
called the folding frequency. If the folding frequency is chosen such that the spectrum $G(f)$ is
negligible beyond the folding frequency, aliasing (the spectral overlapping) is not significant.
Hence, the folding frequency should at least be equal to the highest significant frequency, that
is, the frequency beyond which $G(f)$ is negligible. We shall call this frequency the essential
bandwidth $B$ (in hertz). If $g(t)$ is band-limited, then clearly, its bandwidth is identical to the
essential bandwidth. Thus,

$$\frac{f_s}{2} \ge B \ \text{Hz} \tag{3.105a}$$

Moreover, the sampling interval $T_s = 1/f_s$ [Eq. (3.104)]. Hence,

$$T_s \le \frac{1}{2B} \tag{3.105b}$$

Once we pick $B$, we can choose $T_s$ according to Eq. (3.105b). Also,

$$f_0 = \frac{1}{T_0} \tag{3.106}$$

where $f_0$ is the frequency resolution [separation between samples of $G(f)$]. Hence, if $f_0$ is
given, we can pick $T_0$ according to Eq. (3.106). Knowing $T_0$ and $T_s$, we determine $N_0$ from

$$N_0 = \frac{T_0}{T_s} \tag{3.107}$$

In general, if the signal is time-limited, $G(f)$ is not band-limited, and there is aliasing in
the computation of $G_q$. To reduce the aliasing effect, we need to increase the folding frequency;
that is, we must reduce $T_s$ (the sampling interval) as much as is practicable. If the signal is
band-limited, $g(t)$ is not time-limited, and there is aliasing (overlapping) in the computation
of $g_k$. To reduce this aliasing, we need to increase $T_0$, the period of $g_k$. This results in reducing
the frequency sampling interval $f_0$ (in hertz). In either case (reducing $T_s$ in the time-limited
case or increasing $T_0$ in the band-limited case), for higher accuracy, we need to increase the
number of samples $N_0$ because $N_0 = T_0/T_s$. There are also signals that are neither time-limited
nor band-limited. In such cases, we need to reduce $T_s$ and increase $T_0$.
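The selection procedure of Eqs. (3.105) to (3.107) can be collected into a small helper. The Python sketch below is one possible arrangement, not a prescribed method: the function name is hypothetical, and rounding $N_0$ up to a power of 2 (by shrinking $T_s$) is an added convenience for FFT use.

```python
import math

def dft_parameters(B, f0):
    """Pick T_s, T_0, N_0 from essential bandwidth B (Hz) and resolution f0 (Hz).

    Uses T_s <= 1/(2B) [Eq. (3.105b)], T_0 = 1/f0 [Eq. (3.106)], and
    N_0 = T_0/T_s [Eq. (3.107)], then rounds N_0 up to a power of 2.
    """
    T0 = 1.0 / f0
    N0_min = math.ceil(T0 * 2 * B)                 # smallest N_0 with T_s <= 1/(2B)
    N0 = 1 << max(0, (N0_min - 1).bit_length())    # next power of 2 >= N0_min
    return T0 / N0, T0, N0

Ts, T0, N0 = dft_parameters(B=100 / math.pi, f0=0.25)
print(Ts, T0, N0)
```

With $B = 100/\pi$ Hz and $f_0 = 0.25$ Hz, this helper returns $T_s = 1/64$ s, $T_0 = 4$ s, and $N_0 = 256$.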


Points of Discontinuity

If $g(t)$ has a jump discontinuity at a sampling point, the sample value should be taken as the
average of the values on the two sides of the discontinuity because the Fourier representation
at a point of discontinuity converges to the average value.

Using the FFT Algorithm in DFT Computations

The number of computations required in performing the DFT was dramatically reduced by
an algorithm developed by Tukey and Cooley in 1965 (Ref. 6). This algorithm, known as the fast
Fourier transform (FFT), reduces the number of computations from something on the order
of $N_0^2$ to $N_0 \log N_0$. To compute one sample $G_r$ from Eq. (3.103a), we require $N_0$ complex
multiplications and $N_0 - 1$ complex additions. To compute $N_0$ values of $G_r$ ($r = 0, 1, \ldots, N_0 - 1$),
we require a total of $N_0^2$ complex multiplications and $N_0(N_0 - 1)$ complex additions. For
large $N_0$, this can be prohibitively time-consuming, even for a very high-speed computer. The
FFT is, thus, a lifesaver in signal processing applications. The FFT algorithm is simplified if
we choose $N_0$ to be a power of 2, although this is not necessary, in general. Details of the FFT
can be found in any book on signal processing (e.g., Ref. 3).
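To make the saving concrete, the Python sketch below shows one common form of the FFT, a recursive radix-2 decimation-in-time version, and compares it with direct evaluation of Eq. (3.103a). It is a teaching sketch under the assumption that $N_0$ is a power of 2, and it is not claimed to match the original 1965 formulation in detail.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])      # two half-length transforms
    out = [0j] * N
    for q in range(N // 2):
        tw = cmath.exp(-2j * cmath.pi * q / N) * odd[q]   # twiddle factor times odd part
        out[q] = even[q] + tw
        out[q + N // 2] = even[q] - tw
    return out

def dft(x):
    """Direct evaluation of Eq. (3.103a): about N^2 complex multiplications."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * q * k / N) for k in range(N))
            for q in range(N)]

x = [float(k % 5) for k in range(16)]           # arbitrary test sequence
print(max(abs(a - b) for a, b in zip(fft(x), dft(x))))   # agreement to rounding error
```

Each level of recursion halves the problem, which is where the $N_0 \log N_0$ operation count comes from.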


3.10 MATLAB EXERCISES 

Computing Fourier Transforms 

In this section of computer exercises, let us consider two examples illustrating the use of DFT
in finding the Fourier transform. We shall use MATLAB to find DFT by the FFT algorithm.
In the first example, the signal $g(t) = e^{-2t}u(t)$ starts at $t = 0$. In the second example, we use
$g(t) = \Pi(t)$, which starts at $t = -1/2$.


COMPUTER EXAMPLE C3.1 

Use DFT (implemented by the FFT algorithm) to compute the Fourier transform of e~ 2t u(t). Plot the 
resulting Fourier spectra. 


We first determine $T_s$ and $T_0$. The Fourier transform of $e^{-2t}u(t)$ is $1/(j2\pi f + 2)$. This
low-pass signal is not band-limited. Let us take its essential bandwidth to be that frequency
where $|G(f)|$ becomes 1% of its peak value, which occurs at $f = 0$. Observe that

$$|G(f)| = \frac{1}{\sqrt{(2\pi f)^2 + 4}} \approx \frac{1}{2\pi f} \qquad 2\pi f \gg 2$$

Also, the peak of $|G(f)|$ is at $f = 0$, where $|G(0)| = 0.5$. Hence, the essential bandwidth
$B$ is at $f = B$, where

$$|G(f)| \approx \frac{1}{2\pi B} = 0.5 \times 0.01 \quad\Longrightarrow\quad B = \frac{100}{\pi} \text{ Hz}$$

and from Eq. (3.105b),

$$T_s \le \frac{1}{2B} = 0.005\pi = 0.0157$$

Let us round this value down to $T_s = 0.015625$ second so that we have 64 samples per
second. The second issue is to determine $T_0$. The signal is not time-limited. We need to
truncate it at $T_0$ such that $g(T_0) \ll 1$. We shall pick $T_0 = 4$ (eight time constants of
the signal), which yields $N_0 = T_0/T_s = 256$. This is a power of 2. Note that there is
a great deal of flexibility in determining $T_s$ and $T_0$, depending on the accuracy desired
and the computational capacity available. We could just as well have picked $T_0 = 8$ and
$T_s = 1/32$, yielding $N_0 = 256$, although this would have given a slightly higher aliasing
error.
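
The parameter choices above can be retraced numerically. The following MATLAB sketch reproduces the arithmetic under one added assumption: the sampling interval is rounded down to the nearest power of 2, which happens to give the value 1/64 used in the text. The variable names are illustrative only.

```matlab
% Retrace the choice of B, Ts, T0, and N0 for g(t) = exp(-2t)u(t).
peak   = 0.5;                       % |G(0)|
frac   = 0.01;                      % essential bandwidth at 1% of the peak
B      = 1/(2*pi*peak*frac);        % from 1/(2*pi*B) = peak*frac, i.e., 100/pi Hz
Ts_max = 1/(2*B);                   % bound on the sampling interval, 0.005*pi
Ts     = 2^floor(log2(Ts_max));     % assumed rounding rule: next lower power of 2
T0     = 4;                         % truncation interval (eight time constants)
N0     = T0/Ts;                     % number of samples
fprintf('B = %.2f Hz, Ts_max = %.4f s, Ts = 1/%d s, N0 = %d\n', ...
        B, Ts_max, 1/Ts, N0);
```

Other rounding rules are equally valid; the power-of-2 rule is used here only because it keeps $N_0$ a power of 2.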





Because the signal has a jump discontinuity at $t = 0$, the first sample (at $t = 0$) is 0.5,
the average of the values on the two sides of the discontinuity. The MATLAB program,
which implements the DFT by using the FFT algorithm, is as follows:


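Ts=1/64; T0=4; N0=T0/Ts;               % sampling interval, duration, number of samples
t=0:Ts:Ts*(N0-1); t=t';                % sampling instants as a column vector
g=Ts*exp(-2*t);                        % samples g_k = Ts*g(k*Ts)
g(1)=Ts*0.5;                           % average value at the jump at t = 0
G=fft(g);                              % DFT by the FFT algorithm
[Gp,Gm]=cart2pol(real(G),imag(G));     % Gp = phase, Gm = magnitude
k=0:N0-1; k=k';
w=2*pi*k/T0;                           % frequency of each sample in rad/s
subplot(211),stem(w(1:32),Gm(1:32));   % magnitude spectrum
subplot(212),stem(w(1:32),Gp(1:32))    % phase spectrum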


Because $G_q$ is $N_0$-periodic, $G_q = G_{(q+256)}$ so that $G_{256} = G_0$. Hence, we need
to plot $G_q$ over the range $q = 0$ to 255 (not 256). Moreover, because of this periodicity,
$G_{-q} = G_{(-q+256)}$, and the $G_q$ over the range of $q = -127$ to $-1$ are identical to the $G_q$ over
the range of $q = 129$ to 255. Thus, $G_{-127} = G_{129}$, $G_{-126} = G_{130}$, $\ldots$, $G_{-1} = G_{255}$.
In addition, because of the property of conjugate symmetry of the Fourier transform,
$G_{-q} = G_q^*$, it follows that $G_{129} = G_{127}^*$, $G_{130} = G_{126}^*$, $\ldots$, $G_{255} = G_1^*$. Thus, the plots
beyond $q = N_0/2$ (128 in this case) are not necessary for real signals (because they are
conjugates of $G_q$ for $q = 0$ to 128).
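
The conjugate symmetry just described can be checked numerically. The MATLAB sketch below recomputes the DFT of this example and compares $G_{N_0 - q}$ with $G_q^*$ for $q = 1$ to 127. The name sym_err is hypothetical, and the index shift by 1 accounts for MATLAB arrays starting at 1.

```matlab
% Check G_(N0-q) = conj(G_q) for the real sequence of Example C3.1.
Ts = 1/64; T0 = 4; N0 = T0/Ts;
t  = (0:N0-1)'*Ts;
g  = Ts*exp(-2*t);  g(1) = Ts*0.5;    % same samples as in the example
G  = fft(g);
q  = 1:N0/2-1;                        % q = 1, ..., 127
% G_q is stored in G(q+1), so G_(N0-q) is stored in G(N0-q+1)
sym_err = max(abs(G(N0-q+1) - conj(G(q+1))));
fprintf('largest deviation from conjugate symmetry: %g\n', sym_err);
```

The deviation should be at round-off level, which is why only the first $N_0/2 + 1$ samples need to be plotted for a real signal.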

The plot of the Fourier spectra in Fig. 3.40 shows the samples of magnitude and phase
of $G(f)$ at the intervals of $1/T_0 = 1/4$ Hz or $\omega_0 = 1.5708$ rad/s. In Fig. 3.40, we have
shown only the first 28 points (rather than all 128 points) to avoid too much crowding of
the data.


Figure 3.40 Discrete Fourier transform of an exponential signal. Notice that the
horizontal axis in this case is $\omega$ (in radians per second).

In this example, we knew $G(f)$ beforehand and hence could make intelligent choices
for $B$ (or the sampling frequency $f_s$). In practice, we generally do not know $G(f)$
beforehand. In fact, that is the very thing we are trying to determine. In such a case, we must make
an intelligent guess for $B$ or $f_s$ from circumstantial evidence. We then continue reducing
the value of $T_s$ and recomputing the transform until the result stabilizes within the desired
number of significant digits.
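
The refinement procedure described above can be sketched in MATLAB as follows. This is only an illustration under stated assumptions: the starting value of $T_s$, the tolerance tol, and the number nq of low-frequency samples that are compared are all hypothetical choices, and the signal is the exponential of Example C3.1.

```matlab
% Halve Ts until the first few DFT samples stop changing.
T0   = 4;                 % truncation interval in seconds
Ts   = 1/8;               % deliberately coarse first guess (assumed)
tol  = 1e-3;              % hypothetical stopping tolerance
nq   = 8;                 % compare the first nq spectral samples
Gold = inf(nq,1);
for iter = 1:12
    N0 = round(T0/Ts);
    t  = (0:N0-1)'*Ts;
    g  = Ts*exp(-2*t);  g(1) = Ts*0.5;   % average value at the jump
    G  = fft(g);
    Gnew   = G(1:nq);
    change = max(abs(Gnew - Gold));      % change since the previous pass
    if change < tol, break; end
    Gold = Gnew;
    Ts   = Ts/2;                         % refine the sampling interval
end
fprintf('stopped at Ts = 1/%d, |G_0| = %.4f (exact value 0.5)\n', ...
        round(1/Ts), abs(G(1)));
```

A tighter tolerance simply forces more halvings of $T_s$, at the cost of a larger $N_0$.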


Next, we compute the Fourier transform of $g(t) = 8\,\Pi(t)$.


COMPUTER EXAMPLE C3.2 

Use DFT (implemented by the FFT algorithm) to compute the Fourier transform of $8\,\Pi(t)$. Plot the
resulting Fourier spectra.

This rectangular function and its Fourier transform are shown in Fig. 3.41a and b. To
determine the value of the sampling interval $T_s$, we must first decide on the essential
bandwidth $B$. From Fig. 3.41b, we see that $G(f)$ decays rather slowly with $f$. Hence,
the essential bandwidth $B$ is rather large. For instance, at $B = 15.5$ Hz (97.39 rad/s),
$G(f) = -0.1643$, which is about 2% of the peak at $G(0)$. Hence, the essential bandwidth
may be taken as 16 Hz. However, we shall deliberately take $B = 4$ for two reasons: (1) to
show the effect of aliasing and (2) because the use of $B > 4$ will give an enormous number
of samples, which cannot be conveniently displayed on a book-sized page without losing
sight of the essentials. Thus, we shall intentionally accept approximation for the sake of
clarifying the concepts of DFT graphically.

The choice of $B = 4$ results in the sampling interval $T_s = 1/(2B) = 1/8$. Looking
again at the spectrum in Fig. 3.41b, we see that the choice of the frequency resolution
$f_0 = 1/4$ Hz is reasonable. This will give four samples in each lobe of $G(f)$. In this case
$T_0 = 1/f_0 = 4$ seconds and $N_0 = T_0/T_s = 32$. The duration of $g(t)$ is only 1 second. We
must repeat it every 4 seconds ($T_0 = 4$), as shown in Fig. 3.41c, and take samples every
0.125 second. This gives us 32 samples ($N_0 = 32$). Also,

$$g_k = T_s\, g(kT_s) = \frac{1}{8}\, g(kT_s)$$

Since $g(t) = 8\,\Pi(t)$, the values of $g_k$ are 1, 0, or 0.5 (at the points of discontinuity), as
shown in Fig. 3.41c, where for convenience, $g_k$ is shown as a function of $t$ as well as $k$.

In the derivation of the DFT, we assumed that $g(t)$ begins at $t = 0$ (Fig. 3.39a), and
then took $N_0$ samples over the interval $(0, T_0)$. In the present case, however, $g(t)$ begins at
$-\frac{1}{2}$. This difficulty is easily resolved when we realize that the DFT found by this procedure
is actually the DFT of $g_k$ repeating periodically every $T_0$ seconds. From Fig. 3.41c, it is
clear that repeating the segment of $g_k$ over the interval from $-2$ to 2 seconds periodically
is identical to repeating the segment of $g_k$ over the interval from 0 to 4 seconds. Hence,
the DFT of the samples taken from $-2$ to 2 seconds is the same as that of the samples
taken from 0 to 4 seconds. Therefore, regardless of where $g(t)$ starts, we can always take
the samples of $g(t)$ and its periodic extension over the interval from 0 to $T_0$. In the present
example, the 32 sample values are

$$g_k = \begin{cases} 1 & 0 \le k \le 3 \ \text{and}\ 29 \le k \le 31 \\ 0 & 5 \le k \le 27 \\ 0.5 & k = 4,\ 28 \end{cases}$$

Observe that the last sample is at $t = 31/8$, not at 4, because the signal repetition starts
at $t = 4$, and the sample at $t = 4$ is the same as the sample at $t = 0$. Now, $N_0 = 32$ and
$\Omega_0 = 2\pi/32 = \pi/16$. Therefore [see Eq. (3.103a)],

$$G_q = \sum_{k=0}^{31} g_k\, e^{-jq\frac{\pi}{16}k}$$

The MATLAB program, which uses the FFT algorithm to implement this DFT equation,
is given next. First we write a MATLAB program to generate 32 samples of $g_k$, and then
we compute the DFT.


% (c32.m)
B=4; f0=1/4;
Ts=1/(2*B); T0=1/f0;                   % sampling interval and signal duration
N0=T0/Ts;                              % number of samples (32)
k=0:N0-1; k=k';                        % sample index 0 to 31
for m=1:length(k)
  if k(m)>=0 & k(m)<=3, gk(m)=1; end
  if k(m)==4 | k(m)==28, gk(m)=0.5; end   % points of discontinuity
  if k(m)>=5 & k(m)<=27, gk(m)=0; end
  if k(m)>=29 & k(m)<=31, gk(m)=1; end
end
gk=gk';
Gr=fft(gk);                            % 32-point DFT of gk
subplot(211),stem(k,gk)
subplot(212),stem(k,Gr)


Figure 3.41d shows the plot of $G_q$.

The samples $G_q$ are separated by $f_0 = 1/T_0$ Hz. In this case $T_0 = 4$, so the frequency
resolution $f_0$ is $\frac{1}{4}$ Hz, as desired. The folding frequency $f_s/2 = B = 4$ Hz corresponds to
$q = N_0/2 = 16$. Because $G_q$ is $N_0$-periodic ($N_0 = 32$), the values of $G_q$ for $q = -16$ to
$-1$ are the same as those for $q = 16$ to 31. The DFT gives us the samples of the spectrum
$G(f)$.

For the sake of comparison, Fig. 3.41d also shows the shaded curve $8\,\mathrm{sinc}(\pi f)$, which
is the Fourier transform of $8\,\Pi(t)$. The values of $G_q$ computed from the DFT equation show
aliasing error, which is clearly seen by comparing the two superimposed plots. The error
in $G_2$ is just about 1.3%. However, the aliasing error increases rapidly with $q$. For instance,
the error in $G_6$ is about 12%, and the error in $G_{10}$ is 33%. The error in $G_{14}$ is a whopping
72%. The percent error increases rapidly near the folding frequency ($q = 16$) because $g(t)$
has a jump discontinuity, which makes $G(f)$ decay slowly as $1/f$. Hence, near the folding
frequency, the inverted tail (due to aliasing) is very nearly equal to $G(f)$ itself. Moreover,
the final values are the difference between the exact and the folded values (which are very
close to the exact values). Hence, the percent error near the folding frequency ($q = 16$ in
this case) is very high, although the absolute error is very small. Clearly, for signals with
jump discontinuities, the aliasing error near the folding frequency will always be high (in
percentage terms), regardless of the choice of $N_0$. To ensure a negligible aliasing error at
any value $q$, we must make sure that $N_0 \gg q$. This observation is valid for all signals
with jump discontinuities.
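
The percentage errors quoted above can be reproduced, at least approximately, with the following MATLAB sketch. It rebuilds the 32 samples of Example C3.2, takes the DFT, and compares selected samples with the exact transform $8\,\mathrm{sinc}(\pi f)$, where $\mathrm{sinc}(x) = \sin x / x$ as used in the text. The vectorized construction of gk and the name pct_err are illustrative choices, not part of the original program.

```matlab
% Percent aliasing error of selected DFT samples for g(t) = 8*PI(t).
B = 4; f0 = 1/4; Ts = 1/(2*B); T0 = 1/f0; N0 = T0/Ts;
k  = (0:N0-1)';
gk = zeros(N0,1);
gk(k<=3 | k>=29) = 1;             % samples inside the pulse
gk(k==4 | k==28) = 0.5;           % average value at the two jumps
Gq = real(fft(gk));               % gk is real and even, so the DFT is real
q  = [2 6 10 14]';                % samples examined in the text
x  = pi*q*f0;                     % argument of sinc at f = q*f0
Gexact  = 8*sin(x)./x;            % exact transform 8*sinc(pi*f)
pct_err = 100*abs((Gq(q+1) - Gexact)./Gexact);
disp([q Gq(q+1) Gexact pct_err]); % columns: q, DFT value, exact value, error in %
```

The last column should come out close to the percentages quoted above, growing rapidly as $q$ approaches the folding index 16.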
Filtering

We generally think of filtering in terms of a hardware-oriented solution (namely, building
a circuit with RLC components and operational amplifiers). However, filtering also has a
software-oriented solution [a computer algorithm that yields the filtered output $y(t)$ for a given
input $g(t)$]. This can be conveniently accomplished by using the DFT. If $g(t)$ is the signal to
be filtered, then $G_q$, the DFT of $g_k$, is found. The spectrum $G_q$ is then shaped (filtered) as
desired by multiplying $G_q$ by $H_q$, where $H_q$ are the samples of the filter transfer function $H(f)$
[$H_q = H(qf_0)$]. Finally, we take the inverse DFT (or IDFT) of $G_q H_q$ to obtain the filtered
output $y_k$ [$y_k = T_s\, y(kT_s)$]. This procedure is demonstrated in the following example.

COMPUTER EXAMPLE C3.3 

The signal $g(t)$ in Fig. 3.42a is passed through an ideal low-pass filter of transfer function $H(f)$, shown
in Fig. 3.42b. Use DFT to find the filter output.



We have already found the 32-point DFT of $g(t)$ (see Fig. 3.41d). Next we multiply $G_q$ by
$H_q$. To compute $H_q$, we remember that in computing the 32-point DFT of $g(t)$, we have
used $f_0 = 0.25$. Because $G_q$ is 32-periodic, $H_q$ must also be 32-periodic with samples
separated by 0.25 Hz. This means that $H_q$ must be repeated every 8 Hz or $16\pi$ rad/s (see
Fig. 3.42c). This gives the 32 samples of $H_q$ over $0 \le f \le 8$ as follows:

$$H_q = \begin{cases} 1 & 0 \le q \le 7 \ \text{and}\ 25 \le q \le 31 \\ 0 & 9 \le q \le 23 \\ 0.5 & q = 8,\ 24 \end{cases}$$

We multiply $G_q$ by $H_q$ and take the inverse DFT. The resulting output signal is shown in
Fig. 3.42d. Table 3.4 gives a printout of $g_k$, $G_q$, $H_q$, $Y_q$, and $y_k$.

We have already found the 32-point DFT ($G_q$) of $g(t)$ in Example C3.2. The MATLAB
program of Example C3.2 should be saved as an m-file (e.g., "c32.m"). We can import
$G_q$ in the MATLAB environment by the command "c32". Next, we generate 32-point
samples of $H_q$, multiply $G_q$ by $H_q$, and take the inverse DFT to compute $y_k$. We can also
find $y_k$ by convolving $g_k$ with $h_k$.


c32;                                   % run Example C3.2: defines k, gk, and Gr
q=0:31; q=q';                          % sample index 0 to 31
for m=1:length(q)
  if q(m)>=0 & q(m)<=7, Hq(m)=1; end
  if q(m)>=25 & q(m)<=31, Hq(m)=1; end
  if q(m)>=9 & q(m)<=23, Hq(m)=0; end
  if q(m)==8 | q(m)==24, Hq(m)=0.5; end   % band-edge samples
end
Hq=Hq';
Yq=Gr.*Hq;                             % Gr holds the DFT G_q computed by c32
yk=ifft(Yq);
clf,stem(k,yk)


TABLE 3.4

No.   g_k    G_q        H_q    G_q H_q    y_k
0     1      8.000      1      8.000      0.9285
1     1      7.179      1      7.179      1.009
2     1      5.027      1      5.027      1.090
3     1      2.331      1      2.331      0.9123
4     0.5    0.000      1      0.000      0.4847
5     0      -1.323     1      -1.323     0.08884
6     0      -1.497     1      -1.497     -0.05698
7     0      -0.8616    1      -0.8616    -0.01383
8     0      0.000      0.5    0.000      0.02933
9     0      0.5803     0      0.000      0.004837
10    0      0.6682     0      0.000      -0.01966
11    0      0.3778     0      0.000      -0.002156
12    0      0.000      0      0.000      0.01534
13    0      -0.2145    0      0.000      0.0009828
14    0      -0.1989    0      0.000      -0.01338
15    0      -0.06964   0      0.000      -0.0002876
16    0      0.000      0      0.000      0.01280
17    0      -0.06964   0      0.000      -0.0002876
18    0      -0.1989    0      0.000      -0.01338
19    0      -0.2145    0      0.000      0.0009828
20    0      0.000      0      0.000      0.01534
21    0      0.3778     0      0.000      -0.002156
22    0      0.6682     0      0.000      -0.01966
23    0      0.5803     0      0.000      0.004837
24    0      0.000      0.5    0.000      0.02933
25    0      -0.8616    1      -0.8616    -0.01383
26    0      -1.497     1      -1.497     -0.05698
27    0      -1.323     1      -1.323     0.08884
28    0.5    0.000      1      0.000      0.4847
29    1      2.331      1      2.331      0.9123
30    1      5.027      1      5.027      1.090
31    1      7.179      1      7.179      1.009


PROBLEMS


3.1-1 Show that the Fourier transform of $g(t)$ may be expressed as

$$G(f) = \int_{-\infty}^{\infty} g(t) \cos 2\pi f t \, dt - j \int_{-\infty}^{\infty} g(t) \sin 2\pi f t \, dt$$

Hence, show that if $g(t)$ is an even function of $t$, then

$$G(f) = 2 \int_0^{\infty} g(t) \cos 2\pi f t \, dt$$

and if $g(t)$ is an odd function of $t$, then

$$G(f) = -2j \int_0^{\infty} g(t) \sin 2\pi f t \, dt$$

Hence, prove the following.

If g(t) is:                                Then G(f) is:
a real and even function of t              a real and even function of f
a real and odd function of t               an imaginary and odd function of f
an imaginary and even function of t        an imaginary and even function of f
a complex and even function of t           a complex and even function of f
a complex and odd function of t            a complex and odd function of f


3.1-2 (a) Show that for a real $g(t)$, the inverse transform, Eq. (3.9b), can be expressed as

$$g(t) = 2 \int_0^{\infty} |G(f)| \cos[2\pi f t + \theta_g(2\pi f)] \, df$$

This is the trigonometric form of the (inverse) Fourier transform.

(b) Express the Fourier integral (inverse Fourier transform) for $g(t) = e^{-at}u(t)$ in the
trigonometric form given in part (a).

3.1-3 If $g(t) \Longleftrightarrow G(f)$, then show that $g^*(t) \Longleftrightarrow G^*(-f)$.

3.1-4 From definition (3.9a), find the Fourier transforms of the signals shown in Fig. P3.1-4.

Figure P.3.1-4

3.1-5 From definition (3.9a), find the Fourier transforms of the signals shown in Fig. P3.1-5.

Figure P.3.1-5

3.1-6 From definition (3.9b), find the inverse Fourier transforms of the spectra shown in Fig. P3.1-6.

Figure P.3.1-6

3.1-7 From definition (3.9b), find the inverse Fourier transforms of the spectra shown in Fig. P3.1-7.

Figure P.3.1-7

3.1-8 Show that the two signals in parts (a) and (b) of Fig. P3.1-8 are totally different in the time
domain, despite their similarity.

Figure P.3.1-8

Hint: $G(f) = |G(f)|e^{j\theta_g(f)}$. For part (a), $G(f) = 1 \cdot e^{-j2\pi f t_0}$, $|f| \le B$, whereas for part (b),

$$G(f) = \begin{cases} 1e^{-j\pi/2} = -j & 0 < f \le B \\ 1e^{j\pi/2} = j & 0 > f \ge -B \end{cases}$$

3.2-1 Sketch the following functions: (a) $\Pi(t/2)$; (b) $\Delta(3\omega/100)$; (c) $\Pi((t - 10)/8)$; (d) $\mathrm{sinc}(\pi\omega/5)$;
(e) $\mathrm{sinc}[(\omega - 10\pi)/5]$; (f) $\mathrm{sinc}(t/5)\,\Pi(t/10\pi)$.

Hint: $g\left(\frac{x - a}{b}\right)$ is $g\left(\frac{x}{b}\right)$ right-shifted by $a$.

3.2-2 From definition (3.9a), show that the Fourier transform of $\mathrm{rect}(t - 5)$ is $\mathrm{sinc}(\pi f)e^{-j10\pi f}$.

3.2-3 From definition (3.9b), show that the inverse Fourier transform of $\mathrm{rect}[(2\pi f - 10)/2\pi]$ is
$\mathrm{sinc}(\pi t)\,e^{j10t}$.

3.2-4 Using pairs 7 and 12 (Table 3.1) show that $u(t) \Longleftrightarrow 0.5\,\delta(f) + 1/j2\pi f$.

Hint: Add 1 to $\mathrm{sgn}(t)$, and see what signal comes out.

3.2-5 Show that $\cos(2\pi f_0 t + \theta) \Longleftrightarrow \frac{1}{2}[\delta(f + f_0)e^{-j\theta} + \delta(f - f_0)e^{j\theta}]$.

Hint: Express $\cos(2\pi f_0 t + \theta)$ in terms of exponentials using Euler's formula.

3.3-1 Apply the duality property to the appropriate pair in Table 3.1 to show that:

(a) $0.5[\delta(t) + (j/\pi t)] \Longleftrightarrow u(f)$

(b) $\delta(t + T) + \delta(t - T) \Longleftrightarrow 2\cos 2\pi f T$

(c) $\delta(t + T) - \delta(t - T) \Longleftrightarrow 2j \sin 2\pi f T$

Hint: $g(-t) \Longleftrightarrow G(-f)$ and $\delta(t) = \delta(-t)$.

3.3-2 The Fourier transform of the triangular pulse $g(t)$ in Fig. P3.3-2a is given as

$$G(f) = \frac{1}{(2\pi f)^2}\left(e^{j2\pi f} - j2\pi f e^{j2\pi f} - 1\right)$$

Use this information, and the time-shifting and time-scaling properties, to find the Fourier
transforms of the signals shown in Fig. P3.3-2b-f.

Hint: Time inversion in $g(t)$ results in the pulse $g_1(t)$ in Fig. P3.3-2b; consequently $g_1(t) =
g(-t)$. The pulse in Fig. P3.3-2c can be expressed as $g(t - T) + g_1(t - T)$ [the sum of $g(t)$ and $g_1(t)$
both delayed by $T$]. Both pulses in Fig. P3.3-2d and e can be expressed as $g(t - T) + g_1(t + T)$
[the sum of $g(t)$ delayed by $T$ and $g_1(t)$ advanced by $T$] for some suitable choice of $T$. The
pulse in Fig. P3.3-2f can be obtained by time-expanding $g(t)$ by a factor of 2 and then delaying
the resulting pulse by 2 seconds [or by first delaying $g(t)$ by 1 second and then time-expanding
by a factor of 2].

3.3-3 Using only the time-shifting property and Table 3.1, find the Fourier transforms of the signals
shown in Fig. P3.3-3.

Figure P.3.3-3

Hint: The signal in Fig. P3.3-3a is a sum of two shifted rectangular pulses. The signal in
Fig. P3.3-3b is $\sin t\,[u(t) - u(t - \pi)] = \sin t\, u(t) - \sin t\, u(t - \pi) = \sin t\, u(t) + \sin(t - \pi)\,
u(t - \pi)$. The reader should verify that the addition of these two sinusoids indeed results in
the pulse in Fig. P3.3-3b. In the same way, we can express the signal in Fig. P3.3-3c as
$\cos t\, u(t) + \sin(t - \pi/2)\, u(t - \pi/2)$ (verify this by sketching these signals). The signal in
Fig. P3.3-3d is $e^{-at}[u(t) - u(t - T)] = e^{-at}u(t) - e^{-aT}e^{-a(t - T)}u(t - T)$.

3.3-4 Use the time-shifting property to show that if $g(t) \Longleftrightarrow G(f)$, then

$$g(t + T) + g(t - T) \Longleftrightarrow 2G(f) \cos 2\pi f T$$

This is the dual of Eq. (3.36). Use this result and pairs 17 and 19 in Table 3.1 to find the Fourier
transforms of the signals shown in Fig. P3.3-4.

Figure P.3.3-4

3.3-5 Prove the following results:

$$g(t) \sin 2\pi f_0 t \Longleftrightarrow \frac{1}{2j}[G(f - f_0) - G(f + f_0)]$$

$$\frac{1}{2j}[g(t + T) - g(t - T)] \Longleftrightarrow G(f) \sin 2\pi f T$$

Use the latter result and Table 3.1 to find the Fourier transform of the signal in Fig. P3.3-5.

Figure P.3.3-5


3.3-6 The signals in Fig. P3.3-6 are modulated signals with carrier $\cos 10t$. Find the Fourier transforms
of these signals by using the appropriate properties of the Fourier transform and Table 3.1. Sketch
the amplitude and phase spectra for Fig. P3.3-6a and b.

Hint: These functions can be expressed in the form $g(t) \cos 2\pi f_0 t$.

3.3-7 Use the frequency shift property and Table 3.1 to find the inverse Fourier transform of the spectra
shown in Fig. P3.3-7. Notice that this time, the Fourier transform is in the $\omega$ domain.

Figure P.3.3-7

3.3-8 A signal $g(t)$ is band-limited to $B$ Hz. Show that the signal $g^n(t)$ is band-limited to $nB$ Hz.

Hint: $g^2(t) \Longleftrightarrow [G(f) * G(f)]$, and so on. Use the width property of convolution.

3.3-9 Find the Fourier transform of the signal in Fig. P3.3-3a by three different methods:

(a) By direct integration using the definition (3.9a).

(b) Using only pair 17 (Table 3.1) and the time-shifting property.

(c) Using the time differentiation and time-shifting properties, along with the fact that
$\delta(t) \Longleftrightarrow 1$.

Hint: $1 - \cos 2x = 2 \sin^2 x$.

3.3-10 The process of recovering a signal $g(t)$ from the modulated signal $g(t) \cos 2\pi f_0 t$ is called
demodulation. Show that the signal $g(t) \cos 2\pi f_0 t$ can be demodulated by multiplying it by
$2 \cos 2\pi f_0 t$ and passing the product through a low-pass filter of bandwidth $B$ Hz [the bandwidth
of $g(t)$]. Assume $B < f_0$. Hint: $2 \cos^2 2\pi f_0 t = 1 + \cos 4\pi f_0 t$. Recognize that the spectrum of
$g(t) \cos 4\pi f_0 t$ is centered at $2f_0$ and will be suppressed by a low-pass filter of bandwidth $B$ Hz.

3.4-1 Signals $g_1(t) = 10^4\,\Pi(10^4 t)$ and $g_2(t) = \delta(t)$ are applied at the inputs of the ideal low-pass
filters $H_1(f) = \Pi(f/20{,}000)$ and $H_2(f) = \Pi(f/10{,}000)$ (Fig. P3.4-1). The outputs $y_1(t)$ and
$y_2(t)$ of these filters are multiplied to obtain the signal $y(t) = y_1(t)y_2(t)$.

(a) Sketch $G_1(f)$ and $G_2(f)$.

(b) Sketch $H_1(f)$ and $H_2(f)$.

(c) Sketch $Y_1(f)$ and $Y_2(f)$.

(d) Find the bandwidths of $y_1(t)$, $y_2(t)$, and $y(t)$.

3.5-1 For systems with the following impulse responses, which system is causal?

(a) $h(t) = e^{-at}u(t)$, $a > 0$

(b) $h(t) = e^{-a|t|}$, $a > 0$

(c) $h(t) = e^{-a(t - t_0)}u(t - t_0)$, $a > 0$

(d) $h(t) = \mathrm{sinc}(at)$, $a > 0$

(e) $h(t) = \mathrm{sinc}[a(t - t_0)]$, $a > 0$.

3.5-2 Consider a filter with the transfer function

$$H(f) = e^{-k(2\pi f)^2 - j2\pi f t_0}$$

Show that this filter is physically unrealizable by using the time domain criterion [noncausal
$h(t)$] and the frequency domain (Paley-Wiener) criterion. Can this filter be made approximately
realizable by choosing a sufficiently large $t_0$? Use your own (reasonable) criterion of approximate
realizability to determine $t_0$.

Hint: Use pair 22 in Table 3.1.


3.5-3 Show that a filter with transfer function

$$H(f) = \frac{2(10^5)}{(2\pi f)^2 + 10^{10}}\, e^{-j2\pi f t_0}$$

is unrealizable. Can this filter be made approximately realizable by choosing a sufficiently large
$t_0$? Use your own (reasonable) criterion of approximate realizability to determine $t_0$.

Hint: Show that the impulse response is noncausal.


3.5-4 Determine the maximum bandwidth of a signal that can be transmitted through the low-pass
RC filter in Fig. P3.5-4 with $R = 1000$ and $C = 10^{-9}$ if, over this bandwidth, the amplitude
response (gain) variation is to be within 5% and the time delay variation is to be within 2%.

Figure P.3.5-4

3.5-5 A bandpass signal $g(t)$ of bandwidth $B = 2000$ Hz centered at $f = 10^5$ Hz is passed through
the RC filter in Fig. P3.5-4 with $RC = 10^{-3}$. If over the passband, a variation of less than 2%
in amplitude response and less than 1% in time delay is considered distortionless transmission,
would $g(t)$ be transmitted without distortion? Find the approximate expression for the output
signal.

3.6-1 A certain channel has ideal amplitude, but nonideal phase response (Fig. P3.6-1), given by

$$|H(f)| = 1$$

$$\theta_h(f) = -2\pi f t_0 - k \sin 2\pi f T \qquad k \ll 1$$

(a) Show that $y(t)$, the channel response to an input pulse $g(t)$ band-limited to $B$ Hz, is

$$y(t) = g(t - t_0) + \frac{k}{2}[g(t - t_0 - T) - g(t - t_0 + T)]$$

Hint: Use $e^{-jk \sin 2\pi f T} \approx 1 - jk \sin 2\pi f T$.

Figure P.3.6-1

(b) Discuss how this channel will affect TDM and FDM systems from the viewpoint of
interference among the multiplexed signals.

3.6-2 The distortion caused by multipath transmission can be partly corrected by a tapped delay-line
equalizer. Show that if $\alpha \ll 1$, the distortion in the multipath system in Fig. 3.31a can be
approximately corrected if the received signal in Fig. 3.31a is passed through the tapped
delay-line equalizer shown in Fig. P3.6-2.

Hint: From Eq. (3.64a), it is clear that the equalizer filter transfer function should be $H_{eq}(f) =
1/(1 + \alpha e^{-j2\pi f \Delta t})$. Use the fact that $1/(1 - x) = 1 + x + x^2 + x^3 + \cdots$ if $x \ll 1$ to show what
should be the tap parameters $a_i$ to make the resulting transfer function

$$H(f)H_{eq}(f) \approx e^{-j2\pi f t_d}$$

Figure P.3.6-2


3.7-1 Show that the energy of the Gaussian pulse

$$g(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-t^2/2\sigma^2}$$

from direct integration is $1/2\sigma\sqrt{\pi}$. Verify this result by using Parseval's theorem to derive the
energy $E_g$ from $G(f)$. Hint: See pair 22 in Table 3.1. Use the fact that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-x^2 - y^2}\, dx\, dy = \pi \quad\Longrightarrow\quad \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$$

3.7-2 Show that

$$\int_{-\infty}^{\infty} \mathrm{sinc}^2(kt)\, dt = \frac{\pi}{k}$$

Hint: Recognize that the integral is the energy of $g(t) = \mathrm{sinc}(kt)$. Use Parseval's theorem to
find this energy.


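3.7-3 Generalize Parseval's theorem to show that for real Fourier transformable signals $g_1(t)$ and
$g_2(t)$,

$$\int_{-\infty}^{\infty} g_1(t)g_2(t)\, dt = \int_{-\infty}^{\infty} G_1(-f)G_2(f)\, df = \int_{-\infty}^{\infty} G_1(f)G_2(-f)\, df$$

3.7-4 Show that

$$\int_{-\infty}^{\infty} \mathrm{sinc}(2\pi B t - m\pi)\, \mathrm{sinc}(2\pi B t - n\pi)\, dt = \begin{cases} 0 & m \ne n \\ \dfrac{1}{2B} & m = n \end{cases}$$

Hint: Recognize that

$$\mathrm{sinc}(2\pi B t - k\pi) = \mathrm{sinc}\left[2\pi B\left(t - \frac{k}{2B}\right)\right] \Longleftrightarrow \frac{1}{2B}\,\Pi\left(\frac{f}{2B}\right) e^{-j\pi f k/B}$$

Use this fact and the result in Prob. 3.7-2 to show that

$$\int_{-\infty}^{\infty} \mathrm{sinc}(2\pi B t - m\pi)\, \mathrm{sinc}(2\pi B t - n\pi)\, dt = \frac{1}{4B^2} \int_{-B}^{B} e^{j[(n - m)/2B]2\pi f}\, df$$

The desired result follows from this integral.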


3.7-5 For the signal

$$g(t) = \frac{2a}{t^2 + a^2}$$

determine the essential bandwidth $B$ Hz of $g(t)$ such that the energy contained in the spectral
components of $g(t)$ of frequencies below $B$ Hz is 99% of the signal energy $E_g$.

Hint: Determine $G(f)$ by applying the duality property [Eq. (3.26)] to pair 3 of Table 3.1.

3.7-6 A low-pass signal $g(t)$ is applied to a squaring device. The squarer output is applied to a
unity gain ideal low-pass filter of bandwidth $\Delta f$ Hz (Fig. P3.7-6). Show that if $\Delta f$ is very small
($\Delta f \to 0$), the filter output is a dc signal of amplitude $2E_g \Delta f$, where $E_g$ is the energy of $g(t)$.

Hint: The output $y(t)$ is a dc signal because its spectrum $Y(f)$ is concentrated at $f = 0$ from
$-\Delta f$ to $\Delta f$ with $\Delta f \to 0$ (impulse at the origin). If $g^2(t) \Longleftrightarrow A(f)$, and $y(t) \Longleftrightarrow Y(f)$, then
$Y(f) \approx [2A(0)\Delta f]\,\delta(f)$. Now, show that $E_g = A(0)$.

Figure P.3.7-6

3.8-1 Show that the autocorrelation function of $g(t) = C \cos(2\pi f_0 t + \theta_0)$ is given by $\mathcal{R}_g(\tau) =
(C^2/2) \cos 2\pi f_0 \tau$, and the corresponding PSD is $S_g(f) = (C^2/4)[\delta(f - f_0) + \delta(f + f_0)]$.
Hence, show that for a signal $y(t)$ given by

$$y(t) = C_0 + \sum_{n=1}^{\infty} C_n \cos(n2\pi f_0 t + \theta_n)$$

the autocorrelation function and the PSD are given by

$$\mathcal{R}_y(\tau) = C_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} C_n^2 \cos n2\pi f_0 \tau$$

$$S_y(f) = C_0^2\,\delta(f) + \frac{1}{4}\sum_{n=1}^{\infty} C_n^2[\delta(f - nf_0) + \delta(f + nf_0)]$$

Hint: Show that if $g(t) = g_1(t) + g_2(t)$, then $\mathcal{R}_g(\tau) = \mathcal{R}_{g_1}(\tau) + \mathcal{R}_{g_2}(\tau) + \mathcal{R}_{g_1 g_2}(\tau) + \mathcal{R}_{g_2 g_1}(\tau)$,
where $\mathcal{R}_{g_1 g_2}(\tau) = \lim_{T \to \infty} (1/T) \int_{-T/2}^{T/2} g_1(t)g_2(t + \tau)\, dt$. If $g_1(t)$ and $g_2(t)$ represent any
two of the infinite terms in $y(t)$, then show that $\mathcal{R}_{g_1 g_2}(\tau) = \mathcal{R}_{g_2 g_1}(\tau) = 0$. To show this, use
the fact that the area under any sinusoid over a very large time interval is at most equal to the
area of the half-cycle of the sinusoid.

3.8-2 The random binary signal $x(t)$ shown in Fig. P3.8-2 transmits one digit every $T_b$ seconds. A
binary 1 is transmitted by a pulse $p(t)$ of width $T_b/2$ and amplitude $A$; a binary 0 is transmitted by
no pulse. The digits 1 and 0 are equally likely and occur randomly. Determine the autocorrelation
function $\mathcal{R}_x(\tau)$ and the PSD $S_x(f)$.

Figure P.3.8-2

3.8-3 Find the mean square value (or power) of the output voltage $y(t)$ of the RC network shown
in Fig. P3.5-4 with $RC = 2\pi$ if the input voltage PSD $S_x(f)$ is given by (a) $K$; (b) $\Pi(\pi f)$;
(c) $[\delta(f + 1) + \delta(f - 1)]$. In each case calculate the power (mean square value) of the input
signal $x(t)$.

3.8-4 Find the mean square value (or power) of the output voltage $y(t)$ of the system shown in
Fig. P3.8-4 if the input voltage PSD $S_x(f) = \Pi(\pi f)$. Calculate the power (mean square value) of the
input signal $x(t)$.

Figure P.3.8-4




4 AMPLITUDE MODULATIONS
AND DEMODULATIONS


Modulation often refers to a process that moves the message signal into a specific
frequency band that is dictated by the physical channel (e.g., voiceband telephone
modems). Modulation provides a number of advantages mentioned in Chapter 1
including ease of RF transmission and frequency division multiplexing. Modulations can be
analog or digital. Though traditional communication systems such as AM/FM radios and NTSC
television signals are based on analog modulations, more recent systems such as 2G and 3G
cellphones, HDTV, and DSL are all digital.

In this chapter and the next, we will focus on the classic analog modulations: amplitude
modulation and angle modulation. Before we begin our discussion of different analog
modulations, it is important to distinguish between communication systems that do not use modulation
(baseband communications) and systems that use modulation (carrier communications).


4.1 BASEBAND VERSUS CARRIER 
COMMUNICATIONS 

The term baseband is used to designate the frequency band of the original message signal
from the source or the input transducer (see Fig. 1.2). In telephony, the baseband is the audio
band (band of voice signals) of 0 to 3.3 kHz. In NTSC television, the video baseband is
the video band occupying 0 to 4.3 MHz. For digital data or pulse code modulation (PCM)
that uses bipolar signaling at a rate of $R_b$ pulses per second, the baseband is approximately
0 to $R_b$ Hz.

Baseband Communications 

In baseband communication, message signals are directly transmitted without any modification.
Because most baseband signals such as audio and video contain significant low-frequency
content, they cannot be effectively transmitted over radio (wireless) links. Instead, dedicated
user channels such as twisted pairs of copper wires and coaxial cables are assigned to each
user for long-distance communications. Because baseband signals have overlapping bands,
they would interfere severely if sharing a common channel. Thus, baseband communications
leave much of the channel spectrum unused. By modulating several baseband signals and
shifting their spectra to nonoverlapping bands, many users can share one channel by utilizing
most of the available bandwidth through frequency division multiplexing (FDM). Long-haul
communication over a radio link also requires modulation to shift the signal spectrum to
higher frequencies in order to enable efficient power radiation using antennas of reasonable
dimensions. Yet another use of modulation is to exchange transmission bandwidth for better
performance against interferences.

Carrier Modulations 

Communication that uses modulation to shift the frequency spectrum of a signal is known as
carrier communication. In terms of analog modulation, one of the basic parameters
(amplitude, frequency, or phase) of a sinusoidal carrier of high frequency $f_c$ Hz (or $\omega_c = 2\pi f_c$ rad/s)
is varied linearly with the baseband signal $m(t)$. This results in amplitude modulation (AM),
frequency modulation (FM), or phase modulation (PM), respectively. Amplitude modulation
is linear, while the latter two types of carrier modulation are similar and nonlinear, often known
collectively as angle modulation.

A comment about pulse-modulated signals [pulse amplitude modulation (PAM), pulse
width modulation (PWM), pulse position modulation (PPM), pulse code modulation (PCM),
and delta modulation (DM)] is in order here. Despite the term modulation, these signals are
baseband digital signals. "Modulation" is used here not in the sense of frequency or band
shifting. Rather, in these cases it is in fact describing digital pulse coding schemes used to
represent the original analog signals. In other words, the analog message signal is modulating
parameters of a digital pulse train. These signals can still modulate a carrier in order to shift
their spectra.

Amplitude Modulations and Angle Modulations 

We denote as $m(t)$ the source message signal that is to be transmitted by the sender to its
receivers; its Fourier transform is denoted as $M(f)$. To move the frequency response of $m(t)$
to a new frequency band centered at $f_c$ Hz, we begin by noting that the Fourier transform has
already revealed a very strong property known as the frequency shifting property to achieve
this goal. In other words, all we need to do is to multiply $m(t)$ by a sinusoid of frequency $f_c$
such that

$$s_1(t) = m(t) \cos 2\pi f_c t$$

This immediately achieves the basic aim of modulation by moving the signal frequency content
to be centered at $\pm f_c$ via

$$S_1(f) = \frac{1}{2}M(f - f_c) + \frac{1}{2}M(f + f_c)$$

This simple multiplication is in fact allowing changes in the amplitude of the sinusoid $s_1(t)$
to be proportional to the message signal. This method is indeed a very valuable modulation
known as amplitude modulation.
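
The frequency shift produced by this multiplication can be observed numerically. The MATLAB sketch below is only an illustration under assumed values: a single-tone message at 10 Hz, a carrier at 100 Hz, and a 1000 Hz sampling rate, none of which come from the text. It locates the spectral lines of $m(t)$ and of $s_1(t)$ by using the FFT.

```matlab
% Spectral lines of a tone before and after multiplication by a carrier.
fs = 1000;                        % sampling rate in Hz (assumed)
N  = 1000;                        % one second of samples, 1 Hz bin spacing
t  = (0:N-1)'/fs;
fm = 10;  fc = 100;               % hypothetical message tone and carrier (Hz)
m  = cos(2*pi*fm*t);              % message m(t)
s1 = m.*cos(2*pi*fc*t);           % s1(t) = m(t) cos(2*pi*fc*t)
M  = abs(fft(m))/N;               % two-sided amplitude spectra
S1 = abs(fft(s1))/N;
f  = (0:N-1)'*fs/N;               % frequency of each DFT bin in Hz
peaks_m  = f(M  > 0.1 & f < fs/2);    % line of the message (positive f)
peaks_s1 = f(S1 > 0.1 & f < fs/2);    % lines of the modulated signal
disp(peaks_m');  disp(peaks_s1');
```

The message line at $f_m$ reappears at $f_c - f_m$ and $f_c + f_m$ with half the amplitude, in agreement with the expression for $S_1(f)$.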

More broadly, consider a sinusoidal signal

$$s(t) = A(t) \cos[\omega_c t + \phi(t)]$$

There are three variables in a sinusoid: amplitude, (instantaneous) frequency, and phase. Indeed,
the message signal can be used to modulate any one of these three parameters to allow $s(t)$ to
carry the information from the transmitter to the receiver:

Amplitude $A(t)$ linearly varies with $m(t)$ $\Longleftrightarrow$ amplitude modulation
Frequency linearly varies with $m(t)$ $\Longleftrightarrow$ frequency modulation
Phase $\phi(t)$ linearly varies with $m(t)$ $\Longleftrightarrow$ phase modulation

These are known, respectively, as amplitude modulation, frequency modulation, and phase
modulation. In this chapter, we describe various forms of amplitude modulation in practical
communication systems. Amplitude modulations are linear, and their analysis in the time
and frequency domains is simpler. In Chapter 5, we will separately discuss nonlinear angle
modulations.

The Interchangeable Use of f and ω

In Chapter 3, we noted the equivalence of frequency response denoted by frequency f with
angular frequency ω. Each of these two notations has its own advantages and disadvantages.
After the examples and problems of Chapter 3, readers should be familiar and comfortable 
with the use of either notation. Thus, from this point on, we will use the two different notations 
interchangeably, selecting one or the other on the basis of notational or graphical simplicity. 


4.2 DOUBLE-SIDEBAND AMPLITUDE MODULATION 

Amplitude modulation is characterized by an information-bearing carrier amplitude A(t) that is
a linear function of the baseband (message) signal m(t). At the same time, the carrier frequency
ω_c and the phase θ_c remain constant. We can assume θ_c = 0 without loss of generality. If the
carrier amplitude A is made directly proportional to the modulating signal m(t), then the modulated
signal is m(t) cos ω_c t (Fig. 4.1). As we saw earlier [Eq. (3.36)], this type of modulation simply
shifts the spectrum of m(t) to the carrier frequency (Fig. 4.1a). Thus, if

$$ m(t) \Longleftrightarrow M(f) $$

then

$$ m(t) \cos 2\pi f_c t \Longleftrightarrow \frac{1}{2} \left[ M(f + f_c) + M(f - f_c) \right] \qquad (4.1) $$

Recall that M(f − f_c) is M(f) shifted to the right by f_c, and M(f + f_c) is M(f) shifted to
the left by f_c. Thus, the process of modulation shifts the spectrum of the modulating signal to
the left and to the right by f_c. Note also that if the bandwidth of m(t) is B Hz, then, as seen
from Fig. 4.1c, the modulated signal now has bandwidth of 2B Hz. We also observe that the
modulated signal spectrum centered at ±f_c (or ±ω_c in rad/s) consists of two parts: a portion that
lies outside ±f_c, known as the upper sideband (USB), and a portion that lies inside ±f_c, known
as the lower sideband (LSB). We can also see from Fig. 4.1c that, unless the message signal
M(f) has an impulse at zero frequency, the modulated signal in this scheme does not contain
a discrete component of the carrier frequency f_c. In other words, the modulation process does
not introduce a sinusoid at f_c. For this reason it is called double-sideband suppressed carrier
(DSB-SC) modulation.†

The relationship of B to f_c is of interest. Figure 4.1c shows that f_c ≥ B, thus avoiding
overlap of the modulated spectra centered at f_c and −f_c. If f_c < B, then the two copies of
message spectra overlap and the information of m(t) is lost during modulation, which makes
it impossible to get back m(t) from the modulated signal m(t) cos ω_c t.

Note that practical factors may impose additional restrictions on f_c. For instance, in broadcast
applications, a transmit antenna can radiate only a narrow band without distortion. This
means that to avoid distortion caused by the transmit antenna, we must have f_c/B ≫ 1. The


† The term suppressed carrier does not necessarily mean absence of the spectrum at the carrier frequency f_c. It
means that there is no discrete component of the carrier frequency. This implies that the spectrum of the DSB-SC
does not have impulses at ±f_c, which also implies that the modulated signal m(t) cos 2πf_c t does not contain a term
of the form k cos 2πf_c t [assuming that m(t) has a zero mean value].

spectrum shown in Fig. 4.1d, which contains the desired baseband spectrum plus an unwanted
spectrum at ±2f_c. The latter can be suppressed by a low-pass filter. Thus, demodulation, which
is almost identical to modulation, consists of multiplication of the incoming modulated signal
m(t) cos ω_c t by a carrier cos ω_c t followed by a low-pass filter, as shown in Fig. 4.1e. We
can verify this conclusion directly in the time domain by observing that the signal e(t) in
Fig. 4.1e is


$$ e(t) = m(t) \cos^2 \omega_c t = \frac{1}{2} \left[ m(t) + m(t) \cos 2\omega_c t \right] \qquad (4.2a) $$

Therefore, the Fourier transform of the signal e(t) is

$$ E(f) = \frac{1}{2} M(f) + \frac{1}{4} \left[ M(f + 2f_c) + M(f - 2f_c) \right] \qquad (4.2b) $$

This analysis shows that the signal e(t) consists of two components, (1/2)m(t) and
(1/2)m(t) cos 2ω_c t, with nonoverlapping spectra, as shown in Fig. 4.1d. The spectrum
of the second component, being a modulated signal with carrier frequency 2f_c, is centered at
±2f_c. Hence, this component is suppressed by the low-pass filter in Fig. 4.1e. The desired
component (1/2)M(f), being a low-pass spectrum (centered at f = 0), passes through the filter
unharmed, resulting in the output (1/2)m(t). A possible form of low-pass filter characteristics
is shown (under the dotted line) in Fig. 4.1d. The filter leads to a distortionless demodulation of
the message signal m(t) from the DSB-SC signal. We can get rid of the inconvenient fraction
1/2 in the output by using a carrier 2 cos ω_c t instead of cos ω_c t. In fact, later on, we shall
often use this strategy, which does not affect general conclusions.

This method of recovering the baseband signal is called synchronous detection, or coherent
detection, where we use a carrier of exactly the same frequency (and phase) as the carrier
used for modulation. Thus, for demodulation, we need to generate a local carrier at the receiver
in frequency and phase coherence (synchronism) with the carrier used at the modulator.
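
The modulation and coherent demodulation steps of Eqs. (4.1) and (4.2) can be checked numerically. The following is a minimal sketch, not a receiver design: the sample rate, carrier frequency, two-tone message, and the brick-wall FFT low-pass filter are all assumptions chosen so that every tone falls exactly on an FFT bin.

```python
import numpy as np

fs = 100_000          # sample rate in Hz (assumed)
N = 10_000            # 0.1 s of samples, so FFT bins are 10 Hz apart
t = np.arange(N) / fs
fc = 10_000           # carrier frequency f_c in Hz (assumed)
B = 1_000             # low-pass cutoff in Hz (assumed message bandwidth)

# Hypothetical two-tone message whose bandwidth is below B.
m = np.cos(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 700 * t)

def ideal_lowpass(x, cutoff_hz):
    """Zero every FFT bin above cutoff_hz (an idealized brick-wall filter)."""
    X = np.fft.rfft(x)
    f = np.arange(len(x) // 2 + 1) * fs / len(x)
    X[f > cutoff_hz] = 0
    return np.fft.irfft(X, n=len(x))

dsb_sc = m * np.cos(2 * np.pi * fc * t)      # modulator: m(t) cos w_c t
e = dsb_sc * np.cos(2 * np.pi * fc * t)      # e(t) = m(t) cos^2 w_c t, Eq. (4.2a)
recovered = ideal_lowpass(e, B)              # low-pass filter output

max_error = np.max(np.abs(recovered - 0.5 * m))
print(max_error)                             # deviation from (1/2) m(t)
```

Under these assumptions the filter output agrees with (1/2)m(t) to within floating-point rounding. Replacing the demodulating carrier by 2 cos ω_c t removes the factor 1/2, as noted above.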


Example 4.1 For a baseband signal

$$ m(t) = \cos \omega_m t = \cos 2\pi f_m t, $$

find the DSB-SC signal, and sketch its spectrum. Identify the USB and LSB. Verify that the
DSB-SC modulated signal can be demodulated by the demodulator in Fig. 4.1e.



The case in this example is referred to as tone modulation because the modulating signal
is a pure sinusoid, or tone, cos ω_m t. To clarify the basic concepts of DSB-SC modulation,
we shall work this problem in the frequency domain as well as the time domain. In
the frequency domain approach, we work with the signal spectra. The spectrum of the
baseband signal m(t) = cos ω_m t is given by

$$ M(f) = \frac{1}{2} \left[ \delta(f - f_m) + \delta(f + f_m) \right] = \pi \left[ \delta(\omega - \omega_m) + \delta(\omega + \omega_m) \right] $$

The message spectrum consists of two impulses located at ±f_m, as shown in Fig. 4.2a.
The DSB-SC (modulated) spectrum, as seen from Eq. (4.1), is the baseband spectrum in
Fig. 4.2a shifted to the right and the left by f_c (times one-half), as shown in Fig. 4.2b.
This spectrum consists of impulses at frequencies ±(f_c − f_m) and ±(f_c + f_m).
The spectrum beyond f_c is the USB, and the one below f_c is the LSB. Observe that the
DSB-SC spectrum does not have the component of the carrier frequency f_c. This is why
it is called suppressed carrier.

In the time domain approach, we work directly with signals in the time domain. For the
baseband signal m(t) = cos ω_m t, the DSB-SC signal φ_DSB-SC(t) is

$$ \varphi_{\text{DSB-SC}}(t) = m(t) \cos \omega_c t = \cos \omega_m t \cos \omega_c t = \frac{1}{2} \left[ \cos (\omega_c + \omega_m)t + \cos (\omega_c - \omega_m)t \right] $$

This shows that when the baseband (message) signal is a single sinusoid of frequency f_m,
the modulated signal consists of two sinusoids: the component of frequency f_c + f_m (the
USB) and the component of frequency f_c − f_m (the LSB). Figure 4.2b shows precisely the
spectrum of φ_DSB-SC(t). Thus, each component of frequency f_m in the modulating signal
turns into two components of frequencies f_c + f_m and f_c − f_m in the modulated signal. Note
the curious fact that there is no component of the carrier frequency f_c on the right-hand
side of the preceding equation. As mentioned, this is why it is called double-sideband
suppressed carrier (DSB-SC) modulation.

We now verify that the modulated signal φ_DSB-SC(t) = cos ω_m t cos ω_c t, when applied
to the input of the demodulator in Fig. 4.1e, yields the output proportional to the desired
baseband signal cos ω_m t. The signal e(t) in Fig. 4.1e is given by

$$ e(t) = \cos \omega_m t \cos^2 \omega_c t = \frac{1}{2} \cos \omega_m t \, (1 + \cos 2\omega_c t) $$

Figure 4.2 Example of DSB-SC modulation.

The spectrum of the term cos ω_m t cos 2ω_c t is centered at 2ω_c and will be suppressed by
the low-pass filter, yielding (1/2) cos ω_m t as the output. We can also derive this result in the
frequency domain. Demodulation causes the spectrum in Fig. 4.2b to shift left and right by
ω_c (and to be multiplied by one-half). This results in the spectrum shown in Fig. 4.2c. The
low-pass filter suppresses the spectrum centered at ±2ω_c, yielding the spectrum (1/2)M(f).
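
The sideband structure found in Example 4.1 can also be observed with a discrete Fourier transform. The sketch below assumes a 10 kHz carrier and a 1 kHz tone (hypothetical values, chosen to land on FFT bins) and lists the frequencies at which the one-sided spectrum of cos ω_m t cos ω_c t has lines.

```python
import numpy as np

fs, N = 100_000, 10_000          # assumed sample rate and length (10 Hz bins)
t = np.arange(N) / fs
fc, fm = 10_000, 1_000           # assumed carrier and tone frequencies in Hz

dsb_sc = np.cos(2 * np.pi * fm * t) * np.cos(2 * np.pi * fc * t)

# One-sided amplitude spectrum: a cosine of amplitude a appears as a line of height a.
amp = 2 * np.abs(np.fft.rfft(dsb_sc)) / N
freqs = np.arange(N // 2 + 1) * fs / N

peaks = freqs[amp > 0.1]                     # frequencies carrying significant lines
carrier_line = amp[fc * N // fs]             # height of the line at f_c itself

print(peaks)          # the LSB at f_c - f_m and the USB at f_c + f_m
print(carrier_line)   # essentially zero: the carrier is suppressed
```

Each sideband line has height 1/2, in agreement with the time domain expansion, and no line appears at f_c.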


Modulators 

Modulation can be achieved in several ways. We shall discuss some important categories of 
modulators. 

Multiplier Modulators: Here modulation is achieved directly by multiplying m(t) with
cos ω_c t, using an analog multiplier whose output is proportional to the product of two input
signals. Typically, such a multiplier may be obtained from a variable-gain amplifier in which
the gain parameter (such as the β of a transistor) is controlled by one of the signals, say, m(t).
When the signal cos ω_c t is applied at the input of this amplifier, the output is proportional to
m(t) cos ω_c t.

In the early days, multiplication of two signals over a sizable dynamic range was a challenge
to circuit designers. However, as semiconductor technologies continued to advance,
signal multiplication ceased to be a major concern. Still, we will present several classical
modulators that avoid the use of multipliers. Studying these modulators can provide unique insight
and an excellent opportunity to pick up some new signal analysis skills.

Nonlinear Modulators: Modulation can also be achieved by using nonlinear devices,
such as a semiconductor diode or a transistor. Figure 4.3 shows one possible scheme, which
uses two identical nonlinear elements (boxes marked NL).

Figure 4.3 Nonlinear DSB-SC modulator.

Let the input-output characteristics of either of the nonlinear elements be approximated
by a power series

$$ y(t) = a x(t) + b x^2(t) \qquad (4.3) $$

where x(t) and y(t) are the input and the output, respectively, of the nonlinear element. The
summer output z(t) in Fig. 4.3 is given by

$$ z(t) = y_1(t) - y_2(t) = \left[ a x_1(t) + b x_1^2(t) \right] - \left[ a x_2(t) + b x_2^2(t) \right] $$

Substituting the two inputs x_1(t) = cos ω_c t + m(t) and x_2(t) = cos ω_c t − m(t) in this equation
yields

$$ z(t) = 2a \, m(t) + 4b \, m(t) \cos \omega_c t $$

The spectrum of m(t) is centered at the origin, whereas the spectrum of m(t) cos ω_c t is centered
at ±ω_c. Consequently, when z(t) is passed through a bandpass filter tuned to ω_c, the signal
2a m(t) is suppressed and the desired modulated signal 4b m(t) cos ω_c t can pass through the
system without distortion.

In this circuit there are two inputs: m(t) and cos ω_c t. The output of the last summer, z(t),
no longer contains one of the inputs, the carrier signal cos ω_c t. Consequently, the carrier signal
does not appear at the input of the final bandpass filter. The circuit acts as a balanced bridge
for one of the inputs (the carrier). Circuits that have this characteristic are called balanced
circuits. The nonlinear modulator in Fig. 4.3 is an example of a class of modulators known as
balanced modulators. This circuit is balanced with respect to only one input (the carrier); the
other input m(t) still appears at the final bandpass filter, which must reject it. For this reason, it
is called a single balanced modulator. A circuit balanced with respect to both inputs is called
a double balanced modulator, of which the ring modulator (see later: Fig. 4.6) is an example.
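
The algebra behind the nonlinear modulator is easy to confirm numerically. In the sketch below, the coefficients a and b, the message, and the carrier frequency are all hypothetical values; the only thing being checked is that the difference of the two nonlinear outputs equals 2a m(t) + 4b m(t) cos ω_c t, with no bare carrier term.

```python
import numpy as np

a, b = 1.0, 0.2                               # assumed power-series coefficients in Eq. (4.3)
t = np.linspace(0.0, 1e-3, 2001)
m = 0.3 * np.cos(2 * np.pi * 1e3 * t)         # hypothetical message
carrier = np.cos(2 * np.pi * 50e3 * t)        # hypothetical 50 kHz carrier

def nl(x):
    """Nonlinear element y = a x + b x^2."""
    return a * x + b * x**2

z = nl(carrier + m) - nl(carrier - m)         # summer output z(t) = y1(t) - y2(t)
expected = 2 * a * m + 4 * b * m * carrier    # predicted form of z(t)

print(np.allclose(z, expected))
```

The carrier cancels in the subtraction because it enters both branches with the same sign, which is the balancing property described above; the term 2a m(t) survives and must be removed by the bandpass filter.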

Switching Modulators: The multiplication operation required for modulation can be
replaced by a simpler switching operation if we realize that a modulated signal can be obtained
by multiplying m(t) not only by a pure sinusoid but by any periodic signal φ(t) of the
fundamental radian frequency ω_c. Such a periodic signal can be expressed by a trigonometric
Fourier series as

$$ \phi(t) = \sum_{n=0}^{\infty} C_n \cos (n\omega_c t + \theta_n) \qquad (4.4a) $$

Hence,

$$ m(t)\phi(t) = \sum_{n=0}^{\infty} C_n \, m(t) \cos (n\omega_c t + \theta_n) \qquad (4.4b) $$

This shows that the spectrum of the product m(t)φ(t) is the spectrum M(ω) shifted to
±ω_c, ±2ω_c, ..., ±nω_c, .... If this signal is passed through a bandpass filter of bandwidth
2B Hz and tuned to ω_c, then we get the desired modulated signal C_1 m(t) cos (ω_c t + θ_1).*

The square pulse train w(t) in Fig. 4.4b is a periodic signal whose Fourier series was found
earlier (by rewriting the results of Example 2.4) as

$$ w(t) = \frac{1}{2} + \frac{2}{\pi} \left( \cos \omega_c t - \frac{1}{3} \cos 3\omega_c t + \frac{1}{5} \cos 5\omega_c t - \cdots \right) \qquad (4.5) $$

The signal m(t)w(t) is given by

$$ m(t)w(t) = \frac{1}{2} m(t) + \frac{2}{\pi} \left[ m(t) \cos \omega_c t - \frac{1}{3} m(t) \cos 3\omega_c t + \frac{1}{5} m(t) \cos 5\omega_c t - \cdots \right] \qquad (4.6) $$



The signal m(t)w(t) consists not only of the component m(t) but also of an infinite
number of modulated signals with carrier frequencies ω_c, 3ω_c, 5ω_c, .... Therefore, the
spectrum of m(t)w(t) consists of multiple copies of the message spectrum M(f), shifted to
0, ±f_c, ±3f_c, ±5f_c, ... (with decreasing relative weights), as shown in Fig. 4.4c.

For modulation, we are interested in extracting the modulated component m(t) cos ω_c t
only. To separate this component from the rest of the crowd, we pass the signal m(t)w(t) through
a bandpass filter of bandwidth 2B Hz (or 4πB rad/s), centered at the frequency ±f_c. Provided
the carrier frequency f_c ≥ 2B (or ω_c ≥ 4πB), this will suppress all the spectral components
not centered at ±f_c to yield the desired modulated signal (2/π)m(t) cos ω_c t (Fig. 4.4d).
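
A numerical sketch of the switching modulator follows. It assumes a sampled square pulse train w(t) that equals 1 on the positive half-cycles of cos ω_c t, a hypothetical 500 Hz message, and an idealized FFT bandpass filter; none of these choices come from the circuit itself. Because w(t) is only sampled 100 times per carrier cycle here, the agreement with (2/π)m(t) cos ω_c t is close but not exact.

```python
import numpy as np

fs, N = 1_000_000, 100_000        # assumed sample rate; 0.1 s gives 10 Hz FFT bins
t = (np.arange(N) + 0.5) / fs     # half-sample offset keeps samples off the switching edges
fc, B = 10_000, 1_000             # assumed carrier frequency and message bandwidth in Hz

m = np.cos(2 * np.pi * 500 * t)                        # hypothetical message
w = (np.cos(2 * np.pi * fc * t) > 0).astype(float)     # square pulse train w(t): 1 or 0
switched = m * w                                       # m(t) switched on and off

# Idealized bandpass filter of bandwidth 2B centered at fc (FFT masking).
X = np.fft.rfft(switched)
f = np.arange(N // 2 + 1) * fs / N
X[np.abs(f - fc) > B] = 0
bandpassed = np.fft.irfft(X, n=N)

target = (2 / np.pi) * m * np.cos(2 * np.pi * fc * t)
max_error = np.max(np.abs(bandpassed - target))
print(max_error)                  # small; limited by the finite sampling of w(t)
```

No multiplier appears anywhere in the signal path: the only operation applied to m(t) before filtering is an on/off gate.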

* The phase θ_1 is not important.

Figure 4.4 Switching modulator for DSB-SC.

We now see the real payoff of this method. Multiplication of a signal by a square pulse train
is in reality a switching operation in which the signal m(t) is switched on and off periodically; it
can be accomplished by simple switching elements controlled by w(t). Figure 4.5a shows one
such electronic switch, the diode-bridge modulator, driven by a sinusoid A cos ω_c t to produce
the switching action. Diodes D1, D2 and D3, D4 are matched pairs. When the signal cos ω_c t is
of a polarity that will make terminal c positive with respect to d, all the diodes conduct. Because
diodes D1 and D2 are matched, terminals a and b have the same potential and are effectively
shorted. During the next half-cycle, terminal d is positive with respect to c, and all four diodes
open, thus opening terminals a and b. The diode bridge in Fig. 4.5a, therefore, serves as a
desired electronic switch, where terminals a and b open and close periodically with carrier
frequency f_c when a sinusoid A cos ω_c t is applied across terminals c and d. To obtain the signal
m(t)w(t), we may place this electronic switch (terminals a and b) in series (Fig. 4.5b) or across
(in parallel) m(t), as shown in Fig. 4.5c. These modulators are known as the series-bridge
diode modulator and the shunt-bridge diode modulator, respectively. This switching on
and off of m(t) repeats for each cycle of the carrier, resulting in the switched signal m(t)w(t),
which, when bandpass-filtered, yields the desired modulated signal (2/π)m(t) cos ω_c t.

Another switching modulator, known as the ring modulator, is shown in Fig. 4.6a. During
the positive half-cycles of the carrier, diodes D1 and D3 conduct, and D2 and D4 are open.
Hence, terminal a is connected to c, and terminal b is connected to d. During the negative
half-cycles of the carrier, diodes D1 and D3 are open, and D2 and D4 are conducting, thus
connecting terminal a to d and terminal b to c. Hence, the output is proportional to m(t) during
the positive half-cycle and to −m(t) during the negative half-cycle. In effect, m(t) is multiplied
by a square pulse train w_0(t), as shown in Fig. 4.6b. The Fourier series for w_0(t) can be found
by using the signal w(t) of Eq. (4.5) to yield w_0(t) = 2w(t) − 1. Therefore, we can use the
Fourier series of w(t) [Eq. (4.5)] to determine the Fourier series of w_0(t) as

$$ w_0(t) = \frac{4}{\pi} \left( \cos \omega_c t - \frac{1}{3} \cos 3\omega_c t + \frac{1}{5} \cos 5\omega_c t - \cdots \right) \qquad (4.7a) $$

















Hence, we have

$$ v_i(t) = m(t) w_0(t) = \frac{4}{\pi} \left[ m(t) \cos \omega_c t - \frac{1}{3} m(t) \cos 3\omega_c t + \frac{1}{5} m(t) \cos 5\omega_c t - \cdots \right] \qquad (4.7b) $$

The signal m(t)w_0(t) is shown in Fig. 4.6d. When this waveform is passed through a bandpass
filter tuned to ω_c (Fig. 4.6a), the filter output will be the desired signal (4/π)m(t) cos ω_c t.

In this circuit there are two inputs: m(t) and cos ω_c t. The input to the final bandpass filter
does not contain either of these inputs. Consequently, this circuit is an example of a double
balanced modulator.
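
The double balanced property of the ring modulator can be illustrated with the bipolar square wave w_0(t) = 2w(t) − 1. The sketch below uses assumed values for the sample rate, carrier, and message, and inspects the one-sided amplitude spectrum of m(t)w_0(t): nothing should remain at baseband, and each sideband line around f_c should have height close to (4/π)/2 = 2/π.

```python
import numpy as np

fs, N = 1_000_000, 100_000        # assumed sample rate; 0.1 s gives 10 Hz FFT bins
t = (np.arange(N) + 0.5) / fs     # half-sample offset keeps samples off the switching edges
fc, fm, B = 10_000, 500, 1_000    # assumed carrier, tone, and bandwidth in Hz

m = np.cos(2 * np.pi * fm * t)                 # hypothetical tone message
w0 = np.sign(np.cos(2 * np.pi * fc * t))       # bipolar square wave, w0(t) = 2 w(t) - 1
v = m * w0                                     # ring modulator output before filtering

spectrum = 2 * np.abs(np.fft.rfft(v)) / N      # one-sided amplitude spectrum
f = np.arange(N // 2 + 1) * fs / N

baseband_level = spectrum[f <= B].max()        # message feed-through, if any
usb_level = spectrum[(fc + fm) * N // fs]      # line at f_c + f_m

print(baseband_level)     # essentially zero: m(t) itself does not reach the filter
print(usb_level)          # close to 2/pi
```

Compared with the on/off switch w(t), the dc term 1/2 of Eq. (4.5) is gone, so the message no longer appears at the bandpass filter input, and the wanted component is twice as large.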


Example 4.2 Frequency Mixer or Converter

We shall analyze a frequency mixer, or frequency converter, used to change the carrier
frequency of a modulated signal m(t) cos ω_c t from ω_c to another frequency ω_I.

This can be done by multiplying m(t) cos ω_c t by 2 cos ω_mix t, where ω_mix = ω_c + ω_I or
ω_c − ω_I, and then bandpass-filtering the product, as shown in Fig. 4.7a.

Figure 4.7 Frequency mixer or converter. (a) The input m(t) cos ω_c t is multiplied by 2 cos (ω_c ± ω_I)t to form x(t), which is passed through a bandpass filter tuned to ω_I to give m(t) cos ω_I t. (b) Spectra of the mixer signals.


The product x(t) is

$$ x(t) = 2m(t) \cos \omega_c t \cos \omega_{\text{mix}} t = m(t) \left[ \cos (\omega_c - \omega_{\text{mix}})t + \cos (\omega_c + \omega_{\text{mix}})t \right] $$

If we select ω_mix = ω_c − ω_I, then

$$ x(t) = m(t) \left[ \cos \omega_I t + \cos (2\omega_c - \omega_I)t \right] $$

If we select ω_mix = ω_c + ω_I, then

$$ x(t) = m(t) \left[ \cos \omega_I t + \cos (2\omega_c + \omega_I)t \right] $$

In either case, as long as ω_c − ω_I ≥ 2πB and ω_I ≥ 2πB, the various spectra in Fig. 4.7b
will not overlap. Consequently, a bandpass filter at the output, tuned to ω_I, will pass the
term m(t) cos ω_I t and suppress the other term, yielding the output m(t) cos ω_I t. Thus,
the carrier frequency has been translated to ω_I from ω_c.

The operation of frequency mixing/conversion (also known as heterodyning) is basically a
shifting of spectra by an additional ω_mix. This is equivalent to the operation of modulation
with a modulating carrier frequency (the mixer oscillator frequency ω_mix) that differs from
the incoming carrier frequency by ω_I. Any one of the modulators discussed earlier can be
used for frequency mixing. When we select the local carrier frequency ω_mix = ω_c + ω_I,
the operation is called upconversion, and when we select ω_mix = ω_c − ω_I, the operation
is downconversion.
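
The mixer of Example 4.2 can be sketched in a few lines. The carrier, the target frequency, the message, and the FFT-masking bandpass filter below are all assumptions made for illustration; the case shown uses ω_mix = ω_c − ω_I.

```python
import numpy as np

fs, N = 1_000_000, 100_000               # assumed sample rate; 0.1 s gives 10 Hz FFT bins
t = np.arange(N) / fs
fc, fI, B = 100_000, 20_000, 1_000       # assumed incoming carrier, new carrier, bandwidth (Hz)

m = np.cos(2 * np.pi * 500 * t)          # hypothetical message
incoming = m * np.cos(2 * np.pi * fc * t)

f_mix = fc - fI                          # mixer oscillator frequency (fc + fI also works)
x = 2 * incoming * np.cos(2 * np.pi * f_mix * t)

# Idealized bandpass filter of bandwidth 2B tuned to fI (FFT masking).
X = np.fft.rfft(x)
f = np.arange(N // 2 + 1) * fs / N
X[np.abs(f - fI) > B] = 0
output = np.fft.irfft(X, n=N)

max_error = np.max(np.abs(output - m * np.cos(2 * np.pi * fI * t)))
print(max_error)                         # deviation from m(t) cos w_I t
```

With these values the unwanted term sits at 2f_c − f_I = 180 kHz, far from the 20 kHz passband, so the conditions ω_c − ω_I ≥ 2πB and ω_I ≥ 2πB are comfortably met.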


Demodulation of DSB-SC Signals 

As discussed earlier, demodulation of a DSB-SC signal essentially involves multiplication by
the carrier signal and is identical to modulation (see Fig. 4.1). At the receiver, we multiply the
incoming signal by a local carrier of frequency and phase in synchronism with the incoming
carrier. The product is then passed through a low-pass filter. The only difference between the
modulator and the demodulator lies in the input signal and the output filter. In the modulator,
message m(t) is the input while the multiplier output is passed through a bandpass filter tuned
to ω_c, whereas in the demodulator, the DSB-SC signal is the input while the multiplier output
is passed through a low-pass filter. Therefore, all the modulators discussed earlier without
multipliers can also be used as demodulators, provided the bandpass filters at the output are
replaced by low-pass filters of bandwidth B.

For demodulation, the receiver must generate a carrier in phase and frequency synchronism
with the incoming carrier. These demodulators are synonymously called synchronous
or coherent (also homodyne) demodulators.


Example 4.3 Analyze the switching demodulator that uses the electronic switch (diode bridge) in Fig. 4.5a
as a switch (either in series or in parallel).

The input signal is m(t) cos ω_c t. The carrier causes the periodic switching on and off of the
input signal. Therefore, the output is m(t) cos ω_c t × w(t). Using the identity cos x cos y =
0.5[cos (x + y) + cos (x − y)], we obtain

$$ \begin{aligned} m(t) \cos \omega_c t \times w(t) &= m(t) \cos \omega_c t \left[ \frac{1}{2} + \frac{2}{\pi} \left( \cos \omega_c t - \frac{1}{3} \cos 3\omega_c t + \cdots \right) \right] \\ &= \frac{2}{\pi} m(t) \cos^2 \omega_c t + \text{terms of the form } m(t) \cos n\omega_c t \\ &= \frac{1}{\pi} m(t) + \frac{1}{\pi} m(t) \cos 2\omega_c t + \text{terms of the form } m(t) \cos n\omega_c t \end{aligned} $$

Spectra of the terms of the form m(t) cos nω_c t are centered at ±nω_c and are filtered out
by the low-pass filter, yielding the output (1/π)m(t). It is left as an exercise for the reader
to show that the output of the ring circuit in Fig. 4.6a operating as a demodulator (with
the low-pass filter at the output) is (2/π)m(t) (twice that of the switching demodulator in
this example).
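
The result of Example 4.3 can be checked with the same kind of numerical experiment. The sketch assumes a sampled on/off switch driven in step with the carrier, a hypothetical 500 Hz message, and a brick-wall FFT low-pass filter; the small residual error comes from representing w(t) with a finite number of samples per carrier cycle.

```python
import numpy as np

fs, N = 1_000_000, 100_000        # assumed sample rate; 0.1 s gives 10 Hz FFT bins
t = (np.arange(N) + 0.5) / fs     # half-sample offset keeps samples off the switching edges
fc, B = 10_000, 1_000             # assumed carrier frequency and message bandwidth in Hz

m = np.cos(2 * np.pi * 500 * t)                  # hypothetical message
carrier = np.cos(2 * np.pi * fc * t)
w = (carrier > 0).astype(float)                  # switch closed on positive carrier half-cycles
switched = m * carrier * w                       # DSB-SC input passed through the switch

# Idealized low-pass filter of bandwidth B (FFT masking).
X = np.fft.rfft(switched)
f = np.arange(N // 2 + 1) * fs / N
X[f > B] = 0
output = np.fft.irfft(X, n=N)

max_error = np.max(np.abs(output - m / np.pi))
print(max_error)                                 # deviation from (1/pi) m(t)
```

Replacing w by the bipolar wave 2w − 1 in the same script is one way to explore the ring-circuit exercise mentioned above.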


4.3 AMPLITUDE MODULATION (AM) 


In the last section, we began our discussion of amplitude modulation by introducing the DSB-SC
amplitude modulation because it is easy to understand and to analyze in both the time
and frequency domains. However, analytical simplicity does not always equate to simplicity
in practical implementation. The (coherent) demodulation of a DSB-SC signal requires the
receiver to possess a carrier signal that is synchronized with the incoming carrier. This requirement
is not easy to achieve in practice. Because the modulated signal may have traveled
hundreds of miles and could even suffer from some unknown frequency shift, the bandpass
received signal in fact has the form of

$$ r(t) = A_c \, m(t - t_0) \cos \left[ (\omega_c + \Delta\omega)(t - t_0) \right] = A_c \, m(t - t_0) \cos \left[ (\omega_c + \Delta\omega)t - \theta_d \right] $$

in which Δω represents the Doppler effect while

$$ \theta_d = (\omega_c + \Delta\omega) t_0 $$

comes from the unknown delay t_0. To utilize the coherent demodulator, the receiver must be
sophisticated enough to generate a local oscillator cos [(ω_c + Δω)t − θ_d] purely from the
received signal r(t). Such a receiver would be harder to implement and could be quite costly.
This cost is particularly to be avoided in broadcasting systems, which have many receivers for
every transmitter.

The alternative to a coherent demodulator is for the transmitter to send a carrier A cos ω_c t
[along with the modulated signal m(t) cos ω_c t] so that there is no need to generate a carrier
at the receiver. In this case the transmitter needs to transmit at a much higher power level,
which increases its cost as a trade-off. In point-to-point communications, where there is one
transmitter for every receiver, substantial complexity in the receiver system can be justified,
provided its cost is offset by a less expensive transmitter. On the other hand, for a broadcast
system with a huge number of receivers for each transmitter, it is more economical to have
one expensive high-power transmitter and simpler, less expensive receivers because any cost
saving at the receiver is multiplied by the number of receiver units. For this reason, broadcasting
systems tend to favor the trade-off by migrating cost from the (many) receivers to the (fewer)
transmitters.

The second option of transmitting a carrier along with the modulated signal is the obvious
choice in broadcasting because of its desirable trade-offs. This leads to the so-called AM
(amplitude modulation), in which the transmitted signal φ_AM(t) is given by

$$ \begin{aligned} \varphi_{\text{AM}}(t) &= A \cos \omega_c t + m(t) \cos \omega_c t & (4.8a) \\ &= [A + m(t)] \cos \omega_c t & (4.8b) \end{aligned} $$

The spectrum of φ_AM(t) is basically the same as that of φ_DSB-SC(t) = m(t) cos ω_c t except
for the two additional impulses at ±f_c:

$$ \varphi_{\text{AM}}(t) \Longleftrightarrow \frac{1}{2} \left[ M(f + f_c) + M(f - f_c) \right] + \frac{A}{2} \left[ \delta(f + f_c) + \delta(f - f_c) \right] \qquad (4.8c) $$

Upon comparing φ_AM(t) with φ_DSB-SC(t) = m(t) cos ω_c t, it is clear that the AM signal is
identical to the DSB-SC signal with A + m(t) as the modulating signal [instead of m(t)]. The
value of A is always chosen to be positive. Therefore, to sketch φ_AM(t), we sketch the envelope
|A + m(t)| and its mirror image −|A + m(t)| and fill in between with the sinusoid of the carrier
frequency f_c. The size of A affects the time domain envelope of the modulated signal.

Two cases are considered in Fig. 4.8. In the first case, A is large enough that A + m(t) ≥ 0
for all t (that is, A + m(t) is always nonnegative). In the second case, A is not large enough to
satisfy this condition. In the first case, the envelope has the same shape as m(t) (although riding
on a dc of magnitude A). In the second case, the envelope shape differs from the shape of m(t)
because the negative part of A + m(t) is rectified. This means we can detect the desired signal
m(t) by detecting the envelope in the first case, when A + m(t) ≥ 0. Such detection is not possible
in the second case. We shall see that envelope detection is an extremely simple and inexpensive
operation, which does not require generation of a local carrier for the demodulation. But as seen
earlier, the AM envelope has the information about m(t) only if the AM signal [A + m(t)] cos ω_c t
satisfies the condition A + m(t) ≥ 0 for all t.

Figure 4.8 AM signal and its envelope.

Let us now be more precise about the definition of "envelope." Consider a signal
E(t) cos ω_c t. If E(t) varies slowly in comparison with the sinusoidal carrier cos ω_c t, then the
envelope of E(t) cos ω_c t is |E(t)|. This means [see Eq. (4.8b)] that if and only if A + m(t) ≥ 0
for all t, the envelope of φ_AM(t) is

$$ |A + m(t)| = A + m(t) $$

In other words, for envelope detection to properly detect m(t), two conditions must be met:

(a) f_c ≫ bandwidth of m(t)

(b) A + m(t) ≥ 0

This conclusion is readily verified from Fig. 4.8d and e. In Fig. 4.8d, where A + m(t) ≥ 0,
A + m(t) is indeed the envelope, and m(t) can be recovered from this envelope. In Fig. 4.8e,
where A + m(t) is not always positive, the envelope |A + m(t)| is rectified from A + m(t),
and m(t) cannot be recovered from the envelope. Consequently, demodulation of φ_AM(t) in
Fig. 4.8d amounts to simple envelope detection. Thus, the condition for envelope detection
of an AM signal is

$$ A + m(t) \ge 0 \quad \text{for all } t \qquad (4.9a) $$

If m(t) ≥ 0 for all t, then A = 0 already satisfies condition (4.9a). In this case there is no need
to add any carrier because the envelope of the DSB-SC signal m(t) cos ω_c t is m(t) and such a
DSB-SC signal can be detected by envelope detection. In the following discussion we assume
that m(t) ≱ 0 for all t; that is, m(t) can be negative over some range of t.

Message Signals m(t) with Zero Offset: Let ±m_p be the maximum and the minimum
values of m(t), respectively (see Fig. 4.8). This means that m(t) ≥ −m_p. Hence, the condition
of envelope detection (4.9a) is equivalent to

$$ A \ge -m_{\min} = m_p \qquad (4.9b) $$

Thus, the minimum carrier amplitude required for the viability of envelope detection is m_p.
This is quite clear from Fig. 4.8. We define the modulation index μ as

$$ \mu = \frac{m_p}{A} \qquad (4.10a) $$

For envelope detection to be distortionless, the condition is A ≥ m_p. Hence, it follows that

$$ 0 \le \mu \le 1 \qquad (4.10b) $$

is the required condition for the distortionless demodulation of AM by an envelope detector.
When A < m_p, Eq. (4.10a) shows that μ > 1 (overmodulation). In this case, the option of
envelope detection is no longer viable. We then need to use synchronous demodulation. Note
that synchronous demodulation can be used for any value of μ, since the demodulator will
recover signal A + m(t). Only an additional dc block is needed to remove the dc voltage A.
The envelope detector, which is considerably simpler and less expensive than the synchronous
detector, can be used only for μ ≤ 1.
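
The definitions in Eqs. (4.9a) and (4.10a) translate directly into two small helper functions. The sketch below is illustrative only: the function names `modulation_index` and `envelope_detectable` are hypothetical, and the message is an assumed zero-offset tone with m_p = 2.

```python
import numpy as np

def modulation_index(m, A):
    """mu = m_p / A for a zero-offset message, Eq. (4.10a)."""
    m_p = np.max(np.abs(m))          # peak value m_p of the sampled message
    return m_p / A

def envelope_detectable(m, A):
    """Condition (4.9a): A + m(t) >= 0 for every sample."""
    return bool(np.all(A + m >= 0))

t = np.linspace(0.0, 1.0, 1001)
m = 2.0 * np.cos(2 * np.pi * 5 * t)  # hypothetical message with m_p = 2

for A in (4.0, 1.0):
    print(A, modulation_index(m, A), envelope_detectable(m, A))
```

With A = 4 the index is 0.5 and the envelope condition holds; with A = 1 the index is 2 (overmodulation) and the condition fails, so only synchronous demodulation would recover m(t).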

Message Signals m(t) with Nonzero Offset: On rare occasions, the message signal
m(t) will have a nonzero offset such that its maximum m_max and its minimum m_min are not
symmetric, that is,

$$ m_{\min} \ne -m_{\max} $$

In this case, it can be recognized that any offset to the envelope does not change the shape of
the envelope detector output. In fact, one should note that constant offset does not carry any
fresh information.

In this case, envelope detection would still remain distortionless if

$$ 0 \le \mu \le 1 \qquad (4.11a) $$

with a modified modulation index definition of

$$ \mu = \frac{m_{\max} - m_{\min}}{2A + m_{\max} + m_{\min}} \qquad (4.11b) $$



Example 4.4 Sketch φ_AM(t) for modulation indices of μ = 0.5 and μ = 1, when m(t) = b cos ω_m t. This
case is referred to as tone modulation because the modulating signal is a pure sinusoid (or
tone).

In this case, m_max = b and m_min = −b. Hence the modulation index according to Eq.
(4.11b) is

$$ \mu = \frac{b - (-b)}{2A + b + (-b)} = \frac{b}{A} $$

Hence, b = μA and

$$ m(t) = b \cos \omega_m t = \mu A \cos \omega_m t $$

Therefore,

$$ \varphi_{\text{AM}}(t) = [A + m(t)] \cos \omega_c t = A \left[ 1 + \mu \cos \omega_m t \right] \cos \omega_c t $$

Figure 4.9 shows the modulated signals corresponding to μ = 0.5 and μ = 1, respectively.

Figure 4.9 Tone-modulated AM: (a) μ = 0.5, (b) μ = 1.



Sideband and Carrier Power 

The advantage of envelope detection in AM comes at a price. In AM, the carrier term does not
carry any information, and hence, the carrier power is wasteful from this point of view:

$$ \varphi_{\text{AM}}(t) = \underbrace{A \cos \omega_c t}_{\text{carrier}} + \underbrace{m(t) \cos \omega_c t}_{\text{sidebands}} $$

The carrier power P_c is the mean square value of A cos ω_c t, which is A²/2. The sideband
power P_s is the power of m(t) cos ω_c t, which is one-half the mean square value of m(t)
[see Eq. (3.93)]. Hence,

$$ P_c = \frac{A^2}{2} \quad \text{and} \quad P_s = \frac{1}{2} \overline{m^2(t)} $$

The useful message information resides in the sideband power, whereas the carrier power is
used for convenience in modulation and demodulation. The total power is the sum of the
carrier (wasted) power and the sideband (useful) power. Hence, η, the power efficiency, is

$$ \eta = \frac{\text{useful power}}{\text{total power}} = \frac{P_s}{P_c + P_s} = \frac{\overline{m^2(t)}}{A^2 + \overline{m^2(t)}} \, 100\% $$

For the special case of tone modulation,

$$ m(t) = \mu A \cos \omega_m t \quad \text{and} \quad \overline{m^2(t)} = \frac{(\mu A)^2}{2} $$

Hence

$$ \eta = \frac{\mu^2}{2 + \mu^2} \, 100\% $$

with the condition that 0 ≤ μ ≤ 1. It can be seen that η increases monotonically with μ, and
η_max occurs at μ = 1, for which

$$ \eta_{\max} = 33\% $$

Thus, for tone modulation, under the best conditions (μ = 1), only one-third of the transmitted
power is used for carrying messages. For practical signals, the efficiency is even worse (on the
order of 25% or lower) compared with the DSB-SC case. The best condition implies μ = 1.
Smaller values of μ degrade efficiency further. For this reason, volume compression and peak
limiting are commonly used in AM to ensure that full modulation (μ = 1) is maintained most
of the time.


Example 4.5 Determine η and the percentage of the total power carried by the sidebands of the AM wave
for tone modulation when μ = 0.5 and when μ = 0.3.

For μ = 0.5,

$$ \eta = \frac{\mu^2}{2 + \mu^2} \, 100\% = \frac{(0.5)^2}{2 + (0.5)^2} \, 100\% = 11.11\% $$

Hence, only about 11% of the total power is in the sidebands. For μ = 0.3,

$$ \eta = \frac{(0.3)^2}{2 + (0.3)^2} \, 100\% = 4.3\% $$

Hence, only 4.3% of the total power is in the sidebands that contain the message signal.
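
The efficiency figures of Example 4.5 can be reproduced both from the closed form for tone modulation and from the general definition in terms of the mean square value of m(t). The function names and the sampled tone below are hypothetical, introduced only for this check.

```python
import numpy as np

def tone_efficiency(mu):
    """Power efficiency (as a fraction) for tone modulation: mu^2 / (2 + mu^2)."""
    return mu**2 / (2 + mu**2)

def am_efficiency(m, A):
    """General definition: mean(m^2) / (A^2 + mean(m^2)) for a sampled message m."""
    msq = np.mean(m**2)
    return msq / (A**2 + msq)

# Hypothetical sampled tone covering exactly five periods, with A = 1 and mu = 0.5.
t = np.arange(1000) / 1000
A, mu = 1.0, 0.5
m = mu * A * np.cos(2 * np.pi * 5 * t)

print(100 * tone_efficiency(0.5), 100 * am_efficiency(m, A))   # both about 11.11
print(100 * tone_efficiency(0.3))                               # about 4.3
print(100 * tone_efficiency(1.0))                               # about 33.3, the best case
```

The second function accepts any sampled message, so it can be used to see how far a non-sinusoidal m(t) falls below the tone-modulation figure for the same peak value.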


Generation of AM Signals 

In principle, the generation of AM signals is identical to that of the DSB-SC modulations
discussed in Sec. 4.2 except that an additional carrier component A cos ω_c t needs to be added
to the DSB-SC signal.


Demodulation of AM Signals 

Like DSB-SC signals, the AM signal can be demodulated coherently by a locally generated
carrier. Coherent, or synchronous, demodulation of AM, however, defeats the purpose of AM
because it does not take advantage of the additional carrier component $A \cos \omega_c t$. As we have
seen earlier, in the case of $\mu \le 1$, the envelope of the AM signal follows the message signal
$m(t)$. Hence, we shall consider here two noncoherent methods of AM demodulation under the
condition of $0 < \mu \le 1$: rectifier detection and envelope detection.

Rectifier Detector: If an AM signal is applied to a diode and a resistor circuit (Fig. 4.10),
the negative part of the AM wave will be removed. The output across the resistor is a half-wave-rectified
version of the AM signal. Visually, the diode acts like a pair of scissors by cutting off
any negative half-cycle of the modulated sinusoid. In essence, at the rectifier output, the AM
signal is multiplied by $w(t)$. Hence, the half-wave-rectified output $v_R(t)$ is

$$\begin{aligned}
v_R(t) &= \{[A + m(t)] \cos \omega_c t\}\, w(t) \\
&= [A + m(t)] \cos \omega_c t \left[\frac{1}{2} + \frac{2}{\pi}\left(\cos \omega_c t - \frac{1}{3}\cos 3\omega_c t + \frac{1}{5}\cos 5\omega_c t - \cdots\right)\right] \\
&= \frac{1}{\pi}[A + m(t)] + \text{other terms of higher frequencies}
\end{aligned}$$


When $v_R(t)$ is applied to a low-pass filter of cutoff $B$ Hz, the output is $[A + m(t)]/\pi$, and all
the other terms in $v_R$ of frequencies higher than $B$ Hz are suppressed. The dc term $A/\pi$ may
be blocked by a capacitor (Fig. 4.10) to give the desired output $m(t)/\pi$. The output can be
doubled by using a full-wave rectifier.

It is interesting to note that because of the multiplication with $w(t)$, rectifier detection is in
effect synchronous detection performed without using a local carrier. The high carrier content
in AM ensures that its zero crossings are periodic and the information about the frequency and
phase of the carrier at the transmitter is built into the AM signal itself.
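
The rectifier-detection result can be checked numerically. The sketch below is a simplified model under stated assumptions: an ideal diode (negative values are set to zero), a tone message $m(t) = \mu A \cos \omega_m t$, and a crude low-pass filter that simply averages over one carrier cycle. The function name and all parameter values are hypothetical choices made for illustration.

```python
import math


def rectifier_detector(A, mu, f_c, f_m, n_periods=100, samples_per_cycle=200):
    """Half-wave rectify an AM tone signal and average over each carrier cycle.

    Returns a list of (t_mid, value) pairs, one per carrier cycle, where t_mid
    is the centre of the averaging window. Assumes an ideal diode and uses the
    per-cycle average as a stand-in for the low-pass filter.
    """
    dt = 1.0 / (f_c * samples_per_cycle)
    result = []
    for k in range(n_periods):
        acc = 0.0
        for i in range(samples_per_cycle):
            t = (k * samples_per_cycle + i) * dt
            envelope = A + mu * A * math.cos(2.0 * math.pi * f_m * t)
            am = envelope * math.cos(2.0 * math.pi * f_c * t)
            acc += max(am, 0.0)  # ideal diode removes the negative half-cycles
        t_mid = (k + 0.5) * samples_per_cycle * dt
        result.append((t_mid, acc / samples_per_cycle))
    return result


if __name__ == "__main__":
    A, mu, f_c, f_m = 1.0, 0.5, 100e3, 1e3
    for t_mid, v in rectifier_detector(A, mu, f_c, f_m, n_periods=5):
        expected = (A + mu * A * math.cos(2.0 * math.pi * f_m * t_mid)) / math.pi
        print(f"t = {t_mid:.2e} s  filtered = {v:.4f}  [A + m(t)]/pi = {expected:.4f}")
```

With the carrier frequency far above the message frequency, the averaged output stays close to $[A + m(t)]/\pi$, as the series expansion predicts.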


Envelope Detector: The output of an envelope detector follows the envelope of the
modulated signal. The simple circuit shown in Fig. 4.11a functions as an envelope detector.
On the positive cycle of the input signal, the input grows and may exceed the charged voltage
on the capacitor, $v_C(t)$, turning on the diode and allowing the capacitor $C$ to charge up to the
peak voltage of the input signal cycle. As the input signal falls below this peak value, it falls
quickly below the capacitor voltage (which is very nearly the peak voltage), thus causing the
diode to open. The capacitor now discharges through the resistor $R$ at a slow rate (with a time
constant $RC$). During the next positive cycle, the same drama repeats. As the input signal rises
above the capacitor voltage, the diode conducts again. The capacitor again charges to the peak
value of this (new) cycle. The capacitor discharges slowly during the cutoff period.

During each positive cycle, the capacitor charges up to the peak voltage of the input signal
and then decays slowly until the next positive cycle, as shown in Fig. 4.11b. The output voltage
$v_C(t)$, thus, closely follows the (rising) envelope of the input AM signal. Equally important,
the slow capacitor discharge via the resistor $R$ allows the capacitor voltage to follow a declining
envelope. Capacitor discharge between positive peaks causes a ripple signal of frequency $\omega_c$
in the output. This ripple can be reduced by choosing a larger time constant $RC$ so that the
capacitor discharges very little between the positive peaks ($RC \gg 1/\omega_c$). Picking $RC$ too
large, however, would make it impossible for the capacitor voltage to follow a fast-declining
envelope (see Fig. 4.11b). Because the maximum rate of AM envelope decline is dominated
by the bandwidth $B$ of the message signal $m(t)$, the design criterion of $RC$ should be

$$\frac{1}{\omega_c} \ll RC < \frac{1}{2\pi B} \qquad \text{or} \qquad 2\pi B < \frac{1}{RC} \ll \omega_c$$

Figure 4.11 Envelope detector for AM.

The envelope detector output is $v_C(t) = A + m(t)$ with a ripple of frequency $\omega_c$. The dc term
$A$ can be blocked out by a capacitor or a simple RC high-pass filter. The ripple may be reduced
further by another (low-pass) RC filter.
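
The design criterion $1/\omega_c \ll RC < 1/(2\pi B)$ and the charge/discharge behavior can be sketched in a few lines of Python. This is a simplified model, not a circuit simulation: the diode is assumed ideal (the capacitor jumps to the input whenever the input is higher), and the `margin` factor that stands in for "$\ll$" is an arbitrary assumption. All names and numeric values are hypothetical.

```python
import math


def rc_is_suitable(rc, f_c, bandwidth, margin=10.0):
    """Check 1/omega_c << RC < 1/(2*pi*B); 'margin' is an assumed factor for '<<'."""
    lower = 1.0 / (2.0 * math.pi * f_c)        # 1 / omega_c
    upper = 1.0 / (2.0 * math.pi * bandwidth)  # 1 / (2*pi*B)
    return margin * lower <= rc < upper


def envelope_detector(samples, dt, rc):
    """Idealized diode-RC detector.

    The capacitor voltage follows the input whenever the input exceeds it
    (diode on) and otherwise decays exponentially with time constant RC.
    """
    decay = math.exp(-dt / rc)
    v_c = 0.0
    output = []
    for x in samples:
        v_c = max(x, v_c * decay)
        output.append(v_c)
    return output


def am_tone(n, dt, A, mu, f_c, f_m):
    """n samples of [A + mu*A*cos(w_m t)] cos(w_c t)."""
    return [
        (A + mu * A * math.cos(2.0 * math.pi * f_m * i * dt))
        * math.cos(2.0 * math.pi * f_c * i * dt)
        for i in range(n)
    ]


if __name__ == "__main__":
    f_c, f_m, A, mu = 1e6, 1e3, 1.0, 0.5
    dt = 1.0 / (50 * f_c)   # 50 samples per carrier cycle
    rc = 5e-5
    print("RC suitable:", rc_is_suitable(rc, f_c, f_m))
    out = envelope_detector(am_tone(50_000, dt, A, mu, f_c, f_m), dt, rc)
    print("detector output range:", min(out), max(out))
```

Trying a much larger `rc` in this sketch shows the failure mode described above: the capacitor voltage can no longer follow the declining part of the envelope.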


4.4 BANDWIDTH-EFFICIENT AMPLITUDE 
MODULATIONS 

As seen from Fig. 4.12, the DSB spectrum (including suppressed carrier and AM) has two
sidebands: the upper sideband (USB) and the lower sideband (LSB), each containing the
complete information of the baseband signal. As a result, for a baseband signal $m(t)$ with
bandwidth $B$ Hz, DSB modulations require twice the radio-frequency bandwidth to transmit.
To improve the spectral efficiency of amplitude modulation, there exist two basic schemes to
either utilize or remove the 100% spectral redundancy:

• Single-sideband (SSB) modulation, which removes either the LSB or the USB and therefore
uses a bandwidth of only $B$ Hz for one message signal $m(t)$;

• Quadrature amplitude modulation (QAM), which utilizes the spectral redundancy by sending
two messages over the same bandwidth of $2B$ Hz.


Amplitude Modulation: Single Sideband (SSB) 

As shown in Fig. 4.13, either the LSB or the USB can be suppressed from the DSB signal
via bandpass filtering. Such a scheme in which only one sideband is transmitted is known as 
single-sideband (SSB) transmission, and requires only one-half the bandwidth of the DSB 
signal. 


Figure 4.12 (a) Original message spectrum. (b) The redundant bandwidth consumption in DSB modulations.

Figure 4.13 SSB spectra from suppressing one DSB sideband: (a) baseband; (b) DSB; (c) USB; (d) LSB; (e) USB signal after multiplication by the carrier.

An SSB signal can be coherently (synchronously) demodulated just like DSB-SC signals.
For example, multiplication of a USB signal (Fig. 4.13c) by $\cos \omega_c t$ shifts its spectrum to the
left and right by $f_c$, yielding the spectrum in Fig. 4.13e. Low-pass filtering of this signal yields
the desired baseband signal. The case is similar with LSB signals. Since the demodulation of
SSB signals is identical to that of DSB-SC signals, the transmitters can now utilize only half
the DSB-SC signal bandwidth without any additional cost to the receivers. Since no additional
carrier accompanies the modulated SSB signal, the resulting modulator outputs are known as
suppressed carrier signals (SSB-SC).

Hilbert Transform

We now introduce for later use a new tool known as the Hilbert transform. We use $x_h(t)$ and
$\mathcal{H}\{x(t)\}$ to denote the Hilbert transform of signal $x(t)$:

$$x_h(t) = \mathcal{H}\{x(t)\} = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{x(\alpha)}{t - \alpha}\, d\alpha \qquad (4.15)$$

Observe that the right-hand side of Eq. (4.15) has the form of a convolution

$$x(t) * \frac{1}{\pi t}$$

Now, application of the duality property to pair 12 of Table 3.1 yields $1/\pi t \Longleftrightarrow -j\,\mathrm{sgn}(f)$.
Hence, application of the time convolution property to the convolution of Eq. (4.15) yields

$$X_h(f) = -jX(f)\,\mathrm{sgn}(f) \qquad (4.16)$$

From Eq. (4.16), it follows that if $m(t)$ passes through a transfer function $H(f) = -j\,\mathrm{sgn}(f)$,
then the output is $m_h(t)$, the Hilbert transform of $m(t)$. Because

$$H(f) = -j\,\mathrm{sgn}(f) \qquad (4.17)$$

$$= \begin{cases} -j = 1 \cdot e^{-j\pi/2} & f > 0 \\ \phantom{-}j = 1 \cdot e^{j\pi/2} & f < 0 \end{cases} \qquad (4.18)$$

it follows that $|H(f)| = 1$ and that $\theta_h(f) = -\pi/2$ for $f > 0$ and $\pi/2$ for $f < 0$, as shown in
Fig. 4.14. Thus, if we change the phase of every component of $m(t)$ by $\pi/2$ (without changing
its amplitude), the resulting signal is $m_h(t)$, the Hilbert transform of $m(t)$. Therefore, a Hilbert
transformer is an ideal phase shifter that shifts the phase of every spectral component by $-\pi/2$.
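
The transfer function $H(f) = -j\,\mathrm{sgn}(f)$ can be applied directly in the discrete frequency domain. The sketch below is a minimal illustration under stated assumptions: the input is one period of a real periodic sequence, a plain $O(N^2)$ DFT is used to keep the example dependency-free, and the dc and Nyquist bins are set to zero because $\mathrm{sgn}(0) = 0$. The function name is hypothetical.

```python
import cmath
import math


def hilbert_transform(x):
    """Discrete Hilbert transform of a real periodic sequence via the DFT."""
    n = len(x)
    # Forward DFT (direct O(n^2) evaluation, adequate for short sequences)
    spectrum = [
        sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]
    shifted = []
    for m, value in enumerate(spectrum):
        if m == 0 or (n % 2 == 0 and m == n // 2):
            shifted.append(0j)            # sgn(0) = 0; Nyquist bin has no sign
        elif m < n / 2:
            shifted.append(-1j * value)   # positive frequencies: multiply by -j
        else:
            shifted.append(1j * value)    # negative frequencies: multiply by +j
    # Inverse DFT; the result is real up to rounding error
    return [
        (sum(shifted[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n).real
        for k in range(n)
    ]


if __name__ == "__main__":
    n = 64
    tone = [math.cos(2.0 * math.pi * 3 * k / n) for k in range(n)]
    shifted_tone = hilbert_transform(tone)
    worst = max(abs(y - math.sin(2.0 * math.pi * 3 * k / n)) for k, y in enumerate(shifted_tone))
    print("max deviation from sin:", worst)
```

For a cosine input the output is the corresponding sine, which is the $\pi/2$ phase delay of a positive-frequency component described above.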


Figure 4.14 Transfer function of an ideal $\pi/2$ phase shifter (Hilbert transformer).

Figure 4.15

Time Domain Representation of SSB Signals

Because the building blocks of an SSB signal are the sidebands, we shall first obtain a time
domain expression for each sideband.

Figure 4.15a shows the message spectrum $M(f)$. Figure 4.15b shows its right half $M_+(f)$,
and Fig. 4.15c shows its left half $M_-(f)$. From Fig. 4.15b and c, we observe that

$$M_+(f) = M(f) \cdot u(f) = M(f)\,\frac{1}{2}[1 + \mathrm{sgn}(f)] = \frac{1}{2}[M(f) + jM_h(f)] \qquad (4.19a)$$

$$M_-(f) = M(f) \cdot u(-f) = M(f)\,\frac{1}{2}[1 - \mathrm{sgn}(f)] = \frac{1}{2}[M(f) - jM_h(f)] \qquad (4.19b)$$

We can now express the SSB signal in terms of $m(t)$ and $m_h(t)$. From Fig. 4.15d it is clear
that the USB spectrum $\Phi_{USB}(f)$ can be expressed as

$$\begin{aligned}
\Phi_{USB}(f) &= M_+(f - f_c) + M_-(f + f_c) \\
&= \frac{1}{2}[M(f - f_c) + M(f + f_c)] - \frac{1}{2j}[M_h(f - f_c) - M_h(f + f_c)]
\end{aligned}$$

From the frequency-shifting property, the inverse transform of this equation yields

$$\varphi_{USB}(t) = m(t)\cos \omega_c t - m_h(t)\sin \omega_c t \qquad (4.20a)$$

Similarly, we can show that

$$\varphi_{LSB}(t) = m(t)\cos \omega_c t + m_h(t)\sin \omega_c t \qquad (4.20b)$$

Hence, a general SSB signal $\varphi_{SSB}(t)$ can be expressed as

$$\varphi_{SSB}(t) = m(t)\cos \omega_c t \mp m_h(t)\sin \omega_c t \qquad (4.20c)$$

where the minus sign applies to USB and the plus sign applies to LSB.

Given the time domain expression of SSB-SC signals, we can now confirm analytically
(instead of graphically) that SSB-SC signals can be coherently demodulated:

$$\begin{aligned}
\varphi_{SSB}(t) \cdot 2\cos \omega_c t &= [m(t)\cos \omega_c t \mp m_h(t)\sin \omega_c t]\, 2\cos \omega_c t \\
&= m(t)[1 + \cos 2\omega_c t] \mp m_h(t)\sin 2\omega_c t \\
&= m(t) + \underbrace{[m(t)\cos 2\omega_c t \mp m_h(t)\sin 2\omega_c t]}_{\text{SSB-SC signal with carrier } 2\omega_c}
\end{aligned}$$

Thus, the product $\varphi_{SSB}(t) \cdot 2\cos \omega_c t$ yields the baseband signal and another SSB signal
with a carrier $2\omega_c$. The spectrum in Fig. 4.13e shows precisely this result. A low-pass filter
will suppress the unwanted SSB terms, giving the desired baseband signal $m(t)$. Hence, the
demodulator is identical to the synchronous demodulator used for DSB-SC. Thus, any one of
the synchronous DSB-SC demodulators discussed earlier in Sec. 4.2 can be used to demodulate
an SSB-SC signal.


Example 4.6 Tone Modulation: SSB 

Find $\varphi_{SSB}(t)$ for a simple case of a tone modulation, that is, when the modulating signal is a
sinusoid $m(t) = \cos \omega_m t$. Also demonstrate the coherent demodulation of this SSB signal.

Recall that the Hilbert transform delays the phase of each spectral component by $\pi/2$.
In the present case, there is only one spectral component, of frequency $\omega_m$. Delaying the
phase of $m(t)$ by $\pi/2$ yields

$$m_h(t) = \cos\left(\omega_m t - \frac{\pi}{2}\right) = \sin \omega_m t$$

Hence, from Eq. (4.20c),

$$\begin{aligned}
\varphi_{SSB}(t) &= \cos \omega_m t \cos \omega_c t \mp \sin \omega_m t \sin \omega_c t \\
&= \cos (\omega_c \pm \omega_m)t
\end{aligned}$$

Thus,

$$\varphi_{USB}(t) = \cos (\omega_c + \omega_m)t \qquad \text{and} \qquad \varphi_{LSB}(t) = \cos (\omega_c - \omega_m)t$$

To verify these results, consider the spectrum of $m(t)$ (Fig. 4.16a) and its DSB-SC
(Fig. 4.16b), USB (Fig. 4.16c), and LSB (Fig. 4.16d) spectra. It is evident that the spectra
in Fig. 4.16c and d do indeed correspond to the $\varphi_{USB}(t)$ and $\varphi_{LSB}(t)$ derived earlier.



Figure 4.16 SSB spectra for tone modulation.

Finally, the coherent demodulation of the SSB tone modulation can be achieved by

$$\begin{aligned}
\varphi_{SSB}(t)\, 2\cos \omega_c t &= 2\cos (\omega_c \pm \omega_m)t \cos \omega_c t \\
&= \cos \omega_m t + \cos (2\omega_c \pm \omega_m)t
\end{aligned}$$

which can be sent to a lowpass filter to retrieve the message tone $\cos \omega_m t$.
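
The tone-modulation results can be verified numerically. The following sketch builds the SSB signal from $m(t) = \cos \omega_m t$ and $m_h(t) = \sin \omega_m t$ using the general form $m(t)\cos \omega_c t \mp m_h(t)\sin \omega_c t$, and forms the demodulator product with $2\cos \omega_c t$. The function names and the frequencies used in the demonstration are hypothetical.

```python
import math


def ssb_tone(t, f_c, f_m, sideband="usb"):
    """SSB signal for m(t) = cos(w_m t): m cos(w_c t) -/+ m_h sin(w_c t)."""
    w_c, w_m = 2.0 * math.pi * f_c, 2.0 * math.pi * f_m
    m = math.cos(w_m * t)
    m_h = math.sin(w_m * t)                     # Hilbert transform of the tone
    sign = -1.0 if sideband == "usb" else 1.0   # minus for USB, plus for LSB
    return m * math.cos(w_c * t) + sign * m_h * math.sin(w_c * t)


def demod_product(t, f_c, f_m, sideband="usb"):
    """Product of the SSB tone signal with the local carrier 2 cos(w_c t)."""
    return 2.0 * ssb_tone(t, f_c, f_m, sideband) * math.cos(2.0 * math.pi * f_c * t)


if __name__ == "__main__":
    f_c, f_m = 10_000.0, 1_000.0
    for t in (0.0, 1.3e-5, 4.7e-4):
        usb = ssb_tone(t, f_c, f_m, "usb")
        direct = math.cos(2.0 * math.pi * (f_c + f_m) * t)
        print(f"t = {t:.1e}: USB = {usb:+.6f}, cos((w_c + w_m)t) = {direct:+.6f}")
```

The demodulator product equals $\cos \omega_m t$ plus a single tone near $2\omega_c$, so any low-pass filter that rejects the $2\omega_c$ region leaves the message tone.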


SSB Modulation Systems 

Three methods are commonly used to generate SSB signals: phase shifting, selective filtering,
and the Weaver method.¹ None of these modulation methods are precise, and all generally
require that the baseband signal spectrum have little power near the origin.

The phase shift method directly uses Eq. (4.20) as its basis. Figure 4.17 shows its implementation.
The box marked “$-\pi/2$” is a phase shifter, which delays the phase of every positive
spectral component by $\pi/2$. Hence, it is a Hilbert transformer. Note that an ideal Hilbert phase
shifter is unrealizable. This is because the Hilbert phase shifter requires an abrupt phase change
of $\pi$ at zero frequency. When the message $m(t)$ has a dc null and very little low-frequency
content, the practical approximation of this ideal phase shifter has almost no real effect and
does not affect the accuracy of SSB modulation.

In the selective-filtering method, the most commonly used method of generating SSB
signals, a DSB-SC signal is passed through a sharp cutoff filter to eliminate the undesired sideband.
To obtain the USB, the filter should pass all components above frequency $f_c$ unattenuated
and completely suppress all components below $f_c$. Such an operation requires an ideal filter,
which is unrealizable. It can, however, be approximated closely if there is some separation
between the passband and the stopband. Fortunately, the voice signal provides this condition,
because its spectrum shows little power content at the origin (Fig. 4.18a). In addition, articulation
tests have shown that for speech signals, frequency components below 300 Hz are not
important. In other words, we may suppress all speech components below 300 Hz (and above
3500 Hz) without affecting intelligibility appreciably. Thus, filtering of the unwanted sideband
becomes relatively easy for speech signals because we have a 600 Hz transition region around
the cutoff frequency $f_c$. To minimize adjacent channel interference, the undesired sideband
should be attenuated at least 40 dB.

Figure 4.17 Generating SSB using the phase shift method.

For very high carrier frequency $f_c$, the ratio of the gap band (600 Hz) to the carrier
frequency may be too small, and, thus, a transition of 40 dB in amplitude over 600 Hz may
be difficult. In such a case, a third method, known as Weaver's method,¹ utilizes two stages
of SSB amplitude modulation. First, the modulation is carried out by using a smaller carrier
frequency ($f_{c_1}$). The resulting SSB signal effectively widens the gap to $2f_{c_1}$ (see shaded spectra
in Fig. 4.18b). Now by treating this signal as the new baseband signal, it is possible to achieve
SSB modulation at a higher carrier frequency.

Detection of SSB Signals with a Carrier (SSB+C)

We now consider SSB signals with an additional carrier (SSB+C). Such a signal can be
expressed as

$$\varphi_{SSB+C}(t) = A\cos \omega_c t + [m(t)\cos \omega_c t + m_h(t)\sin \omega_c t]$$

and $m(t)$ can be recovered by synchronous detection [multiplying $\varphi_{SSB+C}$ by $\cos \omega_c t$] if the
carrier component $A\cos \omega_c t$ can be extracted (by narrowband filtering of $\varphi_{SSB+C}$). Alternatively,
if the carrier amplitude $A$ is large enough, $m(t)$ can also be (approximately) recovered
from $\varphi_{SSB+C}$ by envelope or rectifier detection. This can be shown by rewriting $\varphi_{SSB+C}$ as

$$\begin{aligned}
\varphi_{SSB+C}(t) &= [A + m(t)]\cos \omega_c t + m_h(t)\sin \omega_c t \\
&= E(t)\cos (\omega_c t + \theta)
\end{aligned} \qquad (4.21)$$


where $E(t)$, the envelope of $\varphi_{SSB+C}$, is given by [see Eq. (3.41a)]

$$\begin{aligned}
E(t) &= \left\{[A + m(t)]^2 + m_h^2(t)\right\}^{1/2} \\
&= A\left[1 + \frac{2m(t)}{A} + \frac{m^2(t)}{A^2} + \frac{m_h^2(t)}{A^2}\right]^{1/2}
\end{aligned}$$

If $A \gg |m(t)|$, then in general* $A \gg |m_h(t)|$, and the terms $m^2(t)/A^2$ and $m_h^2(t)/A^2$ can be
ignored. Thus,

$$E(t) \simeq A\left[1 + \frac{2m(t)}{A}\right]^{1/2}$$

Using Taylor series expansion and discarding higher order terms [because $m(t)/A \ll 1$], we
get

$$E(t) \simeq A\left[1 + \frac{m(t)}{A}\right] = A + m(t)$$

It is evident that for a large carrier, the SSB+C can be demodulated by an envelope detector.

In AM, envelope detection requires the condition $A \ge |m(t)|$, whereas for SSB+C, the
condition is $A \gg |m(t)|$. Hence, in the SSB case, the required carrier amplitude is much larger
than that in AM, and, consequently, the efficiency of SSB+C is pathetically low.
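
To get a feel for how large the carrier must be, the sketch below compares the exact SSB+C envelope $E(t) = \{[A + m(t)]^2 + m_h^2(t)\}^{1/2}$ with the approximation $A + m(t)$. It assumes a unit-amplitude tone, $m(t) = \cos \omega_m t$ with $m_h(t) = \sin \omega_m t$; the function names and the carrier amplitudes tried are hypothetical.

```python
import math


def ssb_plus_carrier_envelope(A, m, m_h):
    """Exact envelope sqrt((A + m)^2 + m_h^2) at one time instant."""
    return math.hypot(A + m, m_h)


def max_envelope_error(A, n=1000):
    """Largest |E(t) - (A + m(t))| over one period of a unit-amplitude tone."""
    worst = 0.0
    for k in range(n):
        phase = 2.0 * math.pi * k / n
        m, m_h = math.cos(phase), math.sin(phase)
        error = abs(ssb_plus_carrier_envelope(A, m, m_h) - (A + m))
        worst = max(worst, error)
    return worst


if __name__ == "__main__":
    for A in (2.0, 10.0, 100.0):
        print(f"A = {A:6.1f}: max |E(t) - (A + m(t))| = {max_envelope_error(A):.4f}")
```

The distortion shrinks roughly in proportion to $1/A$, which illustrates why envelope detection of SSB+C needs $A \gg |m(t)|$ rather than merely $A \ge |m(t)|$.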


Quadrature Amplitude Modulation (QAM) 

Because SSB-SC signals are difficult to generate accurately, quadrature amplitude modulation
(QAM) offers an attractive alternative to SSB-SC. QAM can be exactly generated without
requiring sharp-cutoff bandpass filters. QAM operates by transmitting two DSB signals using
carriers of the same frequency but in phase quadrature, as shown in Fig. 4.19. This scheme is
known as quadrature amplitude modulation (QAM) or quadrature multiplexing.

As shown in Fig. 4.19, the boxes labeled $-\pi/2$ are phase shifters that shift the phase
of an input sinusoid by $-\pi/2$ rad. If the two baseband message signals for transmission are
$m_1(t)$ and $m_2(t)$, the corresponding QAM signal $\varphi_{QAM}(t)$, the sum of the two DSB-modulated
signals, is

$$\varphi_{QAM}(t) = m_1(t)\cos \omega_c t + m_2(t)\sin \omega_c t$$


* This may not be true for all $t$, but it is true for most $t$.

Figure 4.19 Quadrature amplitude multiplexing.



Both modulated signals occupy the same band. Yet two baseband signals can be separated at
the receiver by synchronous detection if two local carriers are used in phase quadrature, as
shown in Fig. 4.19. This can be shown by considering the multiplier output $x_1(t)$ of the upper
arm of the receiver (Fig. 4.19):

$$\begin{aligned}
x_1(t) &= 2\varphi_{QAM}(t)\cos \omega_c t = 2[m_1(t)\cos \omega_c t + m_2(t)\sin \omega_c t]\cos \omega_c t \\
&= m_1(t) + m_1(t)\cos 2\omega_c t + m_2(t)\sin 2\omega_c t
\end{aligned} \qquad (4.22a)$$

The last two terms are bandpass signals centered around $2\omega_c$. In fact, they actually form a
QAM signal with $2\omega_c$ as the carrier frequency. They are suppressed by the low-pass filter,
yielding the desired demodulation output $m_1(t)$. Similarly, the output of the lower receiver
branch can be shown to be $m_2(t)$.

$$\begin{aligned}
x_2(t) &= 2\varphi_{QAM}(t)\sin \omega_c t = 2[m_1(t)\cos \omega_c t + m_2(t)\sin \omega_c t]\sin \omega_c t \\
&= m_2(t) - m_2(t)\cos 2\omega_c t + m_1(t)\sin 2\omega_c t
\end{aligned} \qquad (4.22b)$$

Thus, two baseband signals, each of bandwidth $B$ Hz, can be transmitted simultaneously
over a bandwidth $2B$ by using DSB transmission and quadrature multiplexing. The upper
channel is also known as the in-phase (I) channel and the lower channel is the quadrature
(Q) channel. Both signals $m_1(t)$ and $m_2(t)$ can be separately demodulated.

Note, however, that QAM demodulation must be totally synchronous. An error in the phase
or the frequency of the carrier at the demodulator in QAM will result in loss and interference
between the two channels. To show this, let the carrier at the demodulator be $2\cos (\omega_c t + \theta)$.
In this case,

$$\begin{aligned}
x_1(t) &= 2[m_1(t)\cos \omega_c t + m_2(t)\sin \omega_c t]\cos (\omega_c t + \theta) \\
&= m_1(t)\cos \theta - m_2(t)\sin \theta + m_1(t)\cos (2\omega_c t + \theta) + m_2(t)\sin (2\omega_c t + \theta)
\end{aligned}$$

The low-pass filter suppresses the two signals modulated by carrier of angular frequency $2\omega_c$,
resulting in the first demodulator output

$$m_1(t)\cos \theta - m_2(t)\sin \theta$$

Thus, in addition to the desired signal $m_1(t)$, we also receive signal $m_2(t)$ in the upper receiver
branch. A similar phenomenon can be shown for the lower branch. This so-called cochannel
interference is undesirable. Similar difficulties arise when the local frequency is in error (see
Prob. 4.4-1). In addition, unequal attenuation of the USB and the LSB during transmission
leads to cross talk or cochannel interference.
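
The effect of a demodulator phase error on QAM can be illustrated with a short numerical sketch. It assumes the two messages are constant over one carrier cycle (reasonable when the carrier frequency is much higher than the message bandwidth) and models the low-pass filter as an average over exactly one carrier cycle. The function name and the sample values are hypothetical.

```python
import math


def qam_demod(m1, m2, theta, n=1000):
    """Demodulate m1*cos(w_c t) + m2*sin(w_c t) with local carriers
    2cos(w_c t + theta) and 2sin(w_c t + theta), averaging over one carrier cycle."""
    i_acc = 0.0
    q_acc = 0.0
    for k in range(n):
        a = 2.0 * math.pi * k / n                   # carrier phase w_c t
        phi = m1 * math.cos(a) + m2 * math.sin(a)   # QAM signal
        i_acc += 2.0 * phi * math.cos(a + theta)    # upper (in-phase) arm
        q_acc += 2.0 * phi * math.sin(a + theta)    # lower (quadrature) arm
    return i_acc / n, q_acc / n


if __name__ == "__main__":
    m1, m2 = 1.0, -0.5
    for theta in (0.0, 0.1, 0.5):
        i_out, q_out = qam_demod(m1, m2, theta)
        print(f"theta = {theta:.1f} rad: I = {i_out:+.4f}, Q = {q_out:+.4f}")
```

With zero phase error the two arms return $m_1$ and $m_2$; with a nonzero $\theta$ the upper arm returns $m_1\cos \theta - m_2\sin \theta$ and the lower arm $m_1\sin \theta + m_2\cos \theta$, so each channel leaks into the other.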

Quadrature multiplexing is used in analog color television to multiplex the so-called
chrominance signals, which carry the information about colors. There, the synchronization
is achieved by periodic insertion of a short burst of carrier signal (called color burst) in the
transmitted signal. Digital satellite television transmission also applies QAM.

In terms of bandwidth requirement, SSB is similar to QAM but less exacting in terms of
the carrier frequency and phase or the requirement of a distortionless transmission medium.
However, SSB is difficult to generate if the baseband signal $m(t)$ has significant spectral content
near dc.


4.5 AMPLITUDE MODULATIONS: VESTIGIAL 
SIDEBAND (VSB) 

As discussed earlier, it is rather difficult to generate exact SSB signals. They generally require 
that the message signal m(t) have a null around dc. A phase shifter, required in the phase shift 
method, is unrealizable, or only approximately realizable. The generation of DSB signals is 
much simpler, but it requires twice the signal bandwidth. Vestigial sideband (VSB) modulation,
also called the asymmetric sideband system, is a compromise between DSB and SSB. It
inherits the advantages of DSB and SSB but avoids their disadvantages at a small cost. VSB 
signals are relatively easy to generate, and, at the same time, their bandwidth is only a little 
(typically 25%) greater than that of SSB signals. 

In VSB, instead of rejecting one sideband completely (as in SSB), a gradual cutoff of one
sideband, as shown in Fig. 4.20d, is accepted. The baseband signal can be recovered exactly by
a synchronous detector in conjunction with an appropriate equalizer filter $H_o(f)$ at the receiver
output (Fig. 4.21). If a large carrier is transmitted along with the VSB signal, the baseband 
signal can be recovered by an envelope (or a rectifier) detector. 


Figure 4.20 Spectra of the modulating signal and corresponding DSB, SSB, and VSB signals.

If the vestigial shaping filter that produces VSB from DSB is $H_i(f)$ (Fig. 4.21), then the
resulting VSB signal spectrum is

$$\Phi_{VSB}(f) = [M(f + f_c) + M(f - f_c)]H_i(f) \qquad (4.23)$$

This VSB shaping filter $H_i(f)$ allows the transmission of one sideband but suppresses the
other sideband, not completely, but gradually. This makes it easy to realize such a filter, but
the transmission bandwidth is now somewhat higher than that of the SSB (where the other
sideband is suppressed completely). The bandwidth of the VSB signal is typically 25 to 33%
higher than that of the SSB signals.

We require that $m(t)$ be recoverable from $\varphi_{VSB}(t)$ by using synchronous demodulation at
the receiver. This is done by multiplying the incoming VSB signal $\varphi_{VSB}(t)$ by $2\cos \omega_c t$. The
product $e(t)$ is given by

$$e(t) = 2\varphi_{VSB}(t)\cos \omega_c t \Longleftrightarrow [\Phi_{VSB}(f + f_c) + \Phi_{VSB}(f - f_c)]$$

The signal $e(t)$ is further passed through the low-pass equalizer filter of the transfer function
$H_o(f)$. The output of the equalizer filter is required to be $m(t)$. Hence, the output signal
spectrum is given by

$$M(f) = [\Phi_{VSB}(f + f_c) + \Phi_{VSB}(f - f_c)]H_o(f)$$

Substituting Eq. (4.23) into this equation and eliminating the spectra at $\pm 2f_c$ [suppressed by a
low-pass filter $H_o(f)$], we obtain

$$M(f) = M(f)[H_i(f + f_c) + H_i(f - f_c)]H_o(f) \qquad (4.24)$$

Hence

$$H_o(f) = \frac{1}{H_i(f + f_c) + H_i(f - f_c)} \qquad |f| \le B \qquad (4.25)$$

Note that because $H_i(f)$ is a bandpass filter, the terms $H_i(f \pm f_c)$ contain low-pass components.
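
The equalizer relation $H_o(f) = 1/[H_i(f + f_c) + H_i(f - f_c)]$ for $|f| \le B$ is easy to evaluate numerically. In the sketch below the shaping filter is purely hypothetical: it is zero below $f_c - f_v$, rises linearly to 1 at $f_c$, and stays at 1 up to $f_c + B$. It is not the filter of any figure in this chapter, and the values $f_c = 100$ kHz, $f_v = 1$ kHz, and $B = 4$ kHz are assumptions chosen only for illustration.

```python
def h_i_hypothetical(f, f_c=100e3, f_v=1e3, bandwidth=4e3):
    """Hypothetical real, even VSB shaping filter (illustration only)."""
    f = abs(f)  # real filter with even magnitude response
    if f < f_c - f_v or f > f_c + bandwidth:
        return 0.0
    if f < f_c:
        return (f - (f_c - f_v)) / f_v  # linear ramp over the vestige
    return 1.0


def output_equalizer(h_i, f, f_c, bandwidth):
    """H_o(f) = 1 / [H_i(f + f_c) + H_i(f - f_c)] for |f| <= B."""
    if abs(f) > bandwidth:
        raise ValueError("H_o(f) is only specified for |f| <= B")
    return 1.0 / (h_i(f + f_c) + h_i(f - f_c))


if __name__ == "__main__":
    f_c, bandwidth = 100e3, 4e3
    for f in (0.0, 250.0, 500.0, 1000.0, 4000.0):
        h_o = output_equalizer(h_i_hypothetical, f, f_c, bandwidth)
        print(f"f = {f:6.0f} Hz: H_o(f) = {h_o:.4f}")
```

For this assumed filter both sidebands pass fully at $f = 0$, so the equalizer must halve the response there, and it returns to unity once the vestige has died out.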


Complementary VSB Filter and Envelope Detection of VSB + C Signals 

As a special case of a filter at the VSB modulator, we can choose $H_i(f)$ such that

$$H_i(f + f_c) + H_i(f - f_c) = 1 \qquad |f| \le B \qquad (4.26)$$

The output filter is just a simple low-pass filter with transfer function

$$H_o(f) = 1 \qquad |f| \le B$$

Figure 4.21

The resulting VSB signal plus carrier (VSB + C) can be envelope-detected. This demodulation
method may be proved by using exactly the same argument used in proving the case
for SSB + C signals. In particular, because of Eq. (4.26), we can define a new low-pass filter

$$F(f) = j[1 - 2H_i(f - f_c)] = -j[1 - 2H_i(f + f_c)] \qquad |f| \le B$$

Defining a new (complex) low-pass signal as

$$m_v(t) \Longleftrightarrow M_v(f) = F(f)M(f)$$

we can rewrite the VSB signal as

$$\Phi_{VSB}(f) = \frac{M(f - f_c) + M(f + f_c)}{2} + \frac{M_v(f - f_c) - M_v(f + f_c)}{2j} \qquad (4.27a)$$

$$\varphi_{VSB}(t) = m(t)\cos 2\pi f_c t + m_v(t)\sin 2\pi f_c t \qquad (4.27b)$$

Clearly, both the SSB and the VSB modulated signals have the same form, with $m_h(t)$ in
SSB replaced by a low-pass signal $m_v(t)$ in VSB. Applying the same analysis from the SSB+C
envelope detection, a large carrier addition to $\varphi_{VSB}(t)$ would allow the envelope detection of
VSB + C.

We have shown that SSB+C requires a much larger carrier than DSB+C (AM) for envelope
detection. Because VSB+C is an in-between case, the added carrier required in VSB is larger
than that in AM, but smaller than that in SSB + C.


Example 4.7

The carrier frequency of a certain VSB signal is $f_c = 20$ kHz, and the baseband signal bandwidth
is 6 kHz. The VSB shaping filter $H_i(f)$ at the input, which cuts off the lower sideband gradually
over 2 kHz, is shown in Fig. 4.22a. Find the output filter $H_o(f)$ required for distortionless
reception.

Figure 4.22b shows the low-pass segments of $H_i(f + f_c) + H_i(f - f_c)$. We are interested
in this spectrum only over the baseband (the remaining undesired portion is suppressed
by the output filter). This spectrum, which is 0.5 over the band of 0 to 2 kHz, is 1 from
2 to 6 kHz, as shown in Fig. 4.22b. Figure 4.22c shows the desired output filter $H_o(f)$,
which is the reciprocal of the spectrum in Fig. 4.22b [see Eq. (4.25)].


Use of VSB in Broadcast Television 

VSB is a clever compromise between SSB and DSB, which makes it very attractive for
television broadcast systems. The baseband video signal of television occupies an enormous
bandwidth of 4.5 MHz, and a DSB signal needs a bandwidth of 9 MHz. It would seem desirable
to use SSB to conserve bandwidth. Unfortunately, doing this creates several problems. First,
the baseband video signal has sizable power in the low-frequency region, and consequently it
is difficult to suppress one sideband completely. Second, for a broadcast receiver, an envelope
detector is preferred over a synchronous one to reduce the receiver cost. We saw earlier that
SSB+C has a very low power efficiency. Moreover, using SSB will increase the receiver cost.
The spectral shaping of television VSB signals can be illustrated by Fig. 4.23. The vestigial
spectrum is controlled by two filters: the transmitter RF filter $H_T(f)$ and the receiver RF filter
$H_R(f)$. Jointly we have

$$H_i(f) = H_T(f)H_R(f)$$

Hence, the design of the receiver output filter $H_o(f)$ follows Eq. (4.25).

Figure 4.22

Figure 4.23

The DSB spectrum of a television signal is shown in Fig. 4.24a. The vestigial shaping
filter $H_i(f)$ cuts off the lower sideband spectrum gradually, starting at 0.75 MHz to 1.25 MHz
below the carrier frequency $f_c$, as shown in Fig. 4.24b. The receiver output filter $H_o(f)$ is
designed according to Eq. (4.25). The resulting VSB spectrum bandwidth is 6 MHz. Compare
this with the DSB bandwidth of 9 MHz and the SSB bandwidth of 4.5 MHz.


4.6 LOCAL CARRIER SYNCHRONIZATION 

In a suppressed carrier, amplitude-modulated system (DSB-SC, SSB-SC, and VSB-SC), the
coherent receiver must generate a local carrier that is synchronous with the incoming carrier
(frequency and phase). As discussed earlier, any discrepancy in the frequency or phase of the
local carrier gives rise to distortion in the detector output.

Figure 4.24 Television signal spectra: (a) DSB signal, (b) signal transmitted.




Consider an SSB-SC case where a received signal is

$$m(t)\cos [(\omega_c + \Delta\omega)t + \delta] - m_h(t)\sin [(\omega_c + \Delta\omega)t + \delta]$$

because of propagation delay and Doppler frequency shift. The local carrier remains as
$2\cos \omega_c t$. The product of the received signal and the local carrier is $e(t)$, given by

$$\begin{aligned}
e(t) &= 2\cos \omega_c t\, [m(t)\cos (\omega_c t + \Delta\omega t + \delta) - m_h(t)\sin (\omega_c t + \Delta\omega t + \delta)] \\
&= m(t)\cos (\Delta\omega t + \delta) - m_h(t)\sin (\Delta\omega t + \delta) \\
&\quad + \underbrace{m(t)\cos [(2\omega_c + \Delta\omega)t + \delta] - m_h(t)\sin [(2\omega_c + \Delta\omega)t + \delta]}_{\text{bandpass SSB-SC signal around } 2\omega_c + \Delta\omega}
\end{aligned} \qquad (4.28)$$

The bandpass component is filtered out by the receiver low-pass filter, leaving the output
$e_o(t)$ as

$$e_o(t) = m(t)\cos (\Delta\omega t + \delta) - m_h(t)\sin (\Delta\omega t + \delta) \qquad (4.29)$$

If $\Delta\omega$ and $\delta$ are both zero (no frequency or phase error), then

$$e_o(t) = m(t)$$

as expected.

In practice, if the radio wave travels a distance of $d$ meters at the speed of light $c$, then the
phase delay is

$$\delta = -(\omega_c + \Delta\omega)\,d/c$$

which can be any value within the interval $[-\pi, +\pi]$. Two oscillators initially of identical
frequency can also drift apart. Moreover, if the receiver or the transmitter is traveling at a
velocity of $v_e$, then the maximum Doppler frequency shift would be

$$\Delta f_{\max} = \frac{v_e}{c} f_c$$

The velocity $v_e$ depends on the actual vehicles (e.g., spacecraft, airplanes, cars). For example,
if the mobile velocity $v_e$ is 108 km/h, then for a carrier frequency at 100 MHz, the maximum
Doppler frequency shift would be 10 Hz. Such a shift of every frequency component by a fixed
amount $\Delta f$ destroys the harmonic relationship between frequency components. For $\Delta f = 10$
Hz, the components of frequencies 1000 and 2000 Hz will be shifted to frequencies 1010
and 2010 Hz, respectively. This upsets their harmonic relationship and the quality of nonaudio
signals.
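
The Doppler figure quoted above follows directly from $\Delta f_{\max} = (v_e/c) f_c$. The short sketch below reproduces the arithmetic and shows how a fixed shift breaks the 2:1 ratio between two harmonically related components. The function names are illustrative, and the speed of light is taken as 299 792 458 m/s.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_shift(speed_kmh, f_c_hz):
    """Maximum Doppler shift (v_e / c) * f_c for a speed given in km/h."""
    v_e = speed_kmh / 3.6  # convert km/h to m/s
    return v_e / SPEED_OF_LIGHT * f_c_hz


def shift_components(freqs_hz, delta_f_hz):
    """Shift every frequency component by the same fixed amount."""
    return [f + delta_f_hz for f in freqs_hz]


if __name__ == "__main__":
    delta_f = max_doppler_shift(108.0, 100e6)
    print(f"maximum Doppler shift: {delta_f:.2f} Hz")
    low, high = shift_components([1000.0, 2000.0], 10.0)
    print(f"shifted components: {low} Hz and {high} Hz, ratio {high / low:.4f}")
```

A speed of 108 km/h is 30 m/s, so the shift comes out at about 10 Hz, and the shifted components at 1010 and 2010 Hz are no longer in an exact 2:1 ratio.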

It is interesting to note that audio signals are highly redundant, and unless $\Delta f$ is very large,
such a change does not destroy intelligibility of the output. For audio signals, $\Delta f < 30$ Hz
does not significantly affect the signal quality. $\Delta f > 30$ Hz results in a sound quality similar
to that of Donald Duck, but the intelligibility is not completely lost.

Generally, there are two ways to recover the incoming carrier at the receiver. One way is 
for the transmitter to transmit a pilot (sinusoid) signal that can be either the exact carrier or 
directly related to the carrier (e.g., a pilot at half the carrier frequency). The pilot is separated 
at the receiver by a very narrowband filter tuned to the pilot frequency. It is amplified and used 
to synchronize the local oscillator. Another method, in which no pilot is transmitted, is for the 
receiver to use a nonlinear device to process the received signal, to generate a separate carrier 
component that can be extracted using narrow bandpass filters. Clearly, effective and narrow 
bandpass filters are very important to both methods. Moreover, the bandpass filter should also
have the ability to adaptively adjust its center frequency to combat significant frequency drift 
or Doppler shift. Aside from some typical bandpass filter designs, the phase-locked loop (PLL), 
which plays an important role in carrier acquisition of various modulations, can be viewed as 
such a narrow and adaptive bandpass filter. The principles of PLL will be discussed later in 
this chapter. 


4.7 FREQUENCY DIVISION MULTIPLEXING (FDM) 

Signal multiplexing allows the transmission of several signals on the same channel. In 
Chapter 6, we shall discuss time division multiplexing (TDM), where several signals time- 
share the same channel. In FDM, several signals share the band of a channel. Each signal is 
modulated by a different carrier frequency. These carriers, referred to as subcarriers, are adequately
separated to avoid overlap (or interference) between the spectra of various modulated
signals. Each signal may use a different kind of modulation (e.g., DSB-SC, AM, SSB-SC, 
VSB-SC, or even frequency modulation or phase modulation). The modulated signal spectra 
may be separated by a small guard band to avoid interference and facilitate signal separation 
at the receiver. 

When all the modulated spectra are added, we have a composite signal that may be 
considered to be a baseband signal to further modulate a radio-frequency (RF) carrier for the 
purpose of transmission. 

At the receiver, the incoming signal is first demodulated by the RF carrier to retrieve 
the composite baseband, which is then bandpass-filtered to separate all the modulated signals. 
Then each modulated signal is demodulated individually by an appropriate subcarrier to obtain 
all the basic baseband signals. 

One simple example of FDM is the analog telephone long-haul system. There are two 
types of long-haul telephone carrier system: the legacy analog L-carrier hierarchy systems and 
the digital T-carrier hierarchy systems in North America (or the E-carrier in Europe). 3 Both 
were standardized by the predecessor of the International Telecommunications Union known 
(before 1992) as the CC1TT (Comity Consultatif International Telephonique etTelegraphique). 
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Figure 4*25 
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We will first describe the analog telephone hierarchy that utilizes FDM and SSB modulation 
here and defer the digital hierarchy discussion until later (Chapter 6). 

In the analog L-carrier hierarchy, 4 each voice channel is modulated using SSB+C. Twelve 
voice channels form a basic channel group occupying the bandwidth of 60 to 108 kHz. As 
shown in Fig. 4.25, each user channel uses LSB, and frequency division multiplexing (FDM) 
is achieved by maintaining the channel carrier separation of 4 kHz. 

Further up the hierarchy, 5 five groups form a supergroup via FDM. Multiplexing 10 
supergroups generates a mastergroup, and multiplexing six mastergroups forms a jumbo 
group, which consists of 3600 voice channels over a frequency band of 16.984 MHz in the 
L4 system. At each level of the hierarchy from the supergroup, additional frequency gaps are 
provided for interference reduction and for inserting pilot frequencies. The multiplexed signal 
can be fed into the baseband input of a microwave radio channel or directly into a coaxial 
transmission system. 
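
The channel counts in this hierarchy can be checked with a few lines of arithmetic. The sketch below takes a jumbo group to be six mastergroups, which is what the stated total of 3600 voice channels requires. The carrier placement inside the basic group (each LSB carrier at the upper edge of its 4 kHz slot) is an assumption made for illustration, and all variable names are hypothetical.

```matlab
% Channel counts at each level of the analog L-carrier hierarchy
ch_group  = 12;                 % voice channels per basic group
ch_super  = 5*ch_group;         % five groups per supergroup
ch_master = 10*ch_super;        % ten supergroups per mastergroup
ch_jumbo  = 6*ch_master;        % six mastergroups per jumbo group

% Basic group: 12 channels with 4 kHz carrier separation occupy 60 to 108 kHz
spacing_kHz  = 4;
group_bw_kHz = ch_group*spacing_kHz;            % 48 kHz = 108 - 60
% Assumed LSB carrier placement: one carrier at the top edge of each 4 kHz slot
carriers_kHz = 60 + spacing_kHz*(1:ch_group);    % 64, 68, ..., 108 kHz

fprintf('supergroup: %d, mastergroup: %d, jumbo group: %d channels\n', ...
        ch_super, ch_master, ch_jumbo);
fprintf('basic group bandwidth: %d kHz, carriers from %d to %d kHz\n', ...
        group_bw_kHz, carriers_kHz(1), carriers_kHz(end));
```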


4.8 PHASE-LOCKED LOOP AND SOME 
APPLICATIONS 

Phase-Locked Loop (PLL) 

The phase-locked loop (PLL) is a very important device typically used to track the phase and 
the frequency of the carrier component of an incoming signal. It is, therefore, a useful device 
for synchronous demodulation of AM signals with a suppressed carrier or with a little carrier 
(the pilot). It can also be used for the demodulation of angle-modulated signals, especially 
under conditions of low signal-to-noise ratio (SNR). It also has important applications in a 
number of clock recovery systems including timing recovery in digital receivers. For these 
reasons, the PLL plays a key role in nearly every modern digital and analog communication 
system. 

A PLL has three basic components: 

1. A voltage-controlled oscillator (VCO). 

2. A multiplier, serving as a phase detector (PD) or a phase comparator. 

3. A loop filter H(s). 

Figure 4.26 

Basic PLL Operation 

The operation of the PLL is similar to that of a feedback system (Fig. 4.26a). In a typical 
feedback system, the feedback signal tends to follow the input signal. If the feedback signal 
is not equal to the input signal, the difference (known as the error) will change the feedback 
signal until it is close to the input signal. A PLL operates on a similar principle, except that the 
quantity fed back and compared is not the amplitude, but the phase. The VCO adjusts its own 
frequency such that its frequency and phase can track those of the input signal. At this point, 
the two signals are in synchronism (except for a possible difference of a constant phase). 

The voltage-controlled oscillator (VCO) is an oscillator whose frequency can be linearly 
controlled by an input voltage. If a VCO input voltage is $e_o(t)$, its output is a sinusoid with 
instantaneous frequency given by 

$$\omega(t) = \omega_c + c\,e_o(t) \tag{4.30}$$

where $c$ is a constant of the VCO and $\omega_c$ is the free-running frequency of the VCO [when 
$e_o(t) = 0$]. The multiplier output is further low-pass-filtered by the loop filter and then applied 
to the input of the VCO. This voltage changes the frequency of the oscillator and keeps the 
loop locked by forcing the VCO output to track the phase (and hence the frequency) of the 
input sinusoid. 

If the VCO output is $B\cos[\omega_c t + \theta_o(t)]$, then its instantaneous frequency is $\omega_c + \dot{\theta}_o(t)$. 
Therefore, 

$$\dot{\theta}_o(t) = c\,e_o(t) \tag{4.31}$$

Note that $c$ and $B$ are constant parameters of the PLL. 

Let the incoming signal (input to the PLL) be $A\sin[\omega_c t + \theta_i(t)]$. If the incoming signal 
happens to be $A\sin[\omega_0 t + \psi(t)]$, it can still be expressed as $A\sin[\omega_c t + \theta_i(t)]$, where 
$\theta_i(t) = (\omega_0 - \omega_c)t + \psi(t)$. Hence, the analysis that follows is general and not restricted to equal 
frequencies of the incoming signal and the free-running VCO signal. 

The multiplier output is 

$$AB\sin(\omega_c t + \theta_i)\cos(\omega_c t + \theta_o) = \frac{AB}{2}\left[\sin(\theta_i - \theta_o) + \sin(2\omega_c t + \theta_i + \theta_o)\right]$$

The sum frequency term is suppressed by the loop filter. Hence, the effective input to the loop 
filter is $\tfrac{1}{2}AB\sin[\theta_i(t) - \theta_o(t)]$. If $h(t)$ is the unit impulse response of the loop filter, 

$$e_o(t) = h(t) * \tfrac{1}{2}AB\sin[\theta_i(t) - \theta_o(t)] = \tfrac{1}{2}AB\int_0^t h(t - x)\sin[\theta_i(x) - \theta_o(x)]\,dx \tag{4.32}$$

Substituting Eq. (4.32) into Eq. (4.31) and letting $K = \tfrac{1}{2}cB$ lead to 

$$\dot{\theta}_o(t) = AK\int_0^t h(t - x)\sin\theta_e(x)\,dx \tag{4.33}$$

where $\theta_e(t)$ is the phase error, defined as 

$$\theta_e(t) = \theta_i(t) - \theta_o(t)$$

These equations [along with Eq. (4.31)] immediately suggest a model for the PLL, as shown 
in Fig. 4.26b. 

The PLL design requires careful selection of the loop filter H(s) and the loop gain AK. 
Different loop filters can enable the PLL to capture and track input signals with different types 
of frequency variation. On the other hand, the loop gain can affect the range of the trackable 
frequency variation. 

Small-Error PLL Analysis 

In small-error PLL analysis, $\sin\theta_e \approx \theta_e$, and the block diagram in Fig. 4.26b reduces to the 
linear (time-invariant) system shown in Fig. 4.27a. Straightforward feedback analysis gives 

$$\frac{\Theta_o(s)}{\Theta_i(s)} = \frac{AKH(s)/s}{1 + [AKH(s)/s]} = \frac{AKH(s)}{s + AKH(s)} \tag{4.34}$$

Therefore, the PLL acts as a filter with transfer function $AKH(s)/[s + AKH(s)]$, as shown in 
Fig. 4.27b. The error $\Theta_e(s)$ is given by 

$$\Theta_e(s) = \Theta_i(s) - \Theta_o(s) = \left[1 - \frac{\Theta_o(s)}{\Theta_i(s)}\right]\Theta_i(s) = \frac{s}{s + AKH(s)}\,\Theta_i(s) \tag{4.35}$$


Figure 4.27 

One of the important applications of the PLL is in the acquisition of the frequency and the 
phase for the purpose of synchronization. Let the incoming signal be $A\sin(\omega_0 t + \varphi_0)$. We wish 
to generate a local signal of frequency $\omega_0$ and phase* $\varphi_0$. Assuming the quiescent frequency 
of the VCO to be $\omega_c$, the incoming signal can be expressed as $A\sin[\omega_c t + \theta_i(t)]$, where 

$$\theta_i(t) = (\omega_0 - \omega_c)t + \varphi_0$$

and 

$$\Theta_i(s) = \frac{\omega_0 - \omega_c}{s^2} + \frac{\varphi_0}{s}$$

Consider the special case of $H(s) = 1$. Substituting this equation into Eq. (4.35), 

$$\Theta_e(s) = \frac{s}{s + AK}\left[\frac{\omega_0 - \omega_c}{s^2} + \frac{\varphi_0}{s}\right] = \frac{(\omega_0 - \omega_c)/AK}{s} - \frac{(\omega_0 - \omega_c)/AK}{s + AK} + \frac{\varphi_0}{s + AK}$$

Hence, 

$$\theta_e(t) = \frac{\omega_0 - \omega_c}{AK}\left(1 - e^{-AKt}\right) + \varphi_0\,e^{-AKt} \tag{4.36a}$$

Observe that 

$$\lim_{t\to\infty}\theta_e(t) = \frac{\omega_0 - \omega_c}{AK} \tag{4.36b}$$

Hence, after the transient dies (in about 4/AK seconds), the phase error maintains a constant 
value of $(\omega_0 - \omega_c)/AK$. This means the PLL frequency eventually equals the incoming 
frequency $\omega_0$. There is, however, a constant phase error. The PLL output is 

$$B\cos\left[\omega_0 t + \varphi_0 - \frac{\omega_0 - \omega_c}{AK}\right]$$

For a second-order PLL using 

$$H(s) = \frac{s + a}{s} \tag{4.37a}$$

$$\Theta_e(s) = \frac{s}{s + AKH(s)}\,\Theta_i(s) = \frac{s^2}{s^2 + AK(s + a)}\left[\frac{\omega_0 - \omega_c}{s^2} + \frac{\varphi_0}{s}\right] \tag{4.37b}$$

the final value theorem directly yields, 6 

$$\lim_{t\to\infty}\theta_e(t) = \lim_{s\to 0} s\,\Theta_e(s) = 0 \tag{4.38}$$

In this case, the PLL eventually acquires both the frequency and the phase of the incoming 
signal. 


* With a difference $\pi/2$. 

We can use small-error analysis to show that a first-order loop cannot track an incoming 
signal whose instantaneous frequency varies linearly with time. Moreover, such a signal can be 
tracked within a constant phase (constant phase error) by using a second-order loop [Eq. (4.37)], 
and it can be tracked with zero phase error by using a third-order loop. 7 

It must be remembered that the preceding analysis assumes a linear model, which is valid 
only when $\theta_e(t) \ll \pi/2$. This means the frequencies $\omega_0$ and $\omega_c$ must be very close for this 
analysis to be valid. For a general case, one must use the nonlinear model in Fig. 4.26b. For 
such an analysis, the reader is referred to Viterbi, 7 Gardner, 8 or Lindsey. 9 
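
The small-error results can be checked numerically. The following is a minimal sketch, not a design procedure: it integrates the linear loop equations with a forward-Euler step for H(s) = 1 and for H(s) = (s + a)/s, then compares the first-order response with Eq. (4.36a). The loop gain, filter zero, frequency offset, and step size are illustrative assumptions, and the variable names are hypothetical.

```matlab
% Linear (small-error) loop models integrated with a forward-Euler step.
AK   = 100;      % loop gain (1/s), illustrative
a    = 25;       % loop-filter zero for H(s) = (s + a)/s, illustrative
dw   = 20;       % frequency offset w0 - wc (rad/s), illustrative
phi0 = 0.3;      % initial phase of the incoming signal (rad), illustrative
dt   = 1e-4;  t = 0:dt:0.5;  N = length(t);

th1 = zeros(1,N);  th1(1) = phi0;    % phase error, first-order loop H(s) = 1
th2 = zeros(1,N);  th2(1) = phi0;    % phase error, second-order loop
z   = 0;                             % running integral of th2
for n = 1:N-1
    th1(n+1) = th1(n) + dt*(dw - AK*th1(n));
    th2(n+1) = th2(n) + dt*(dw - AK*(th2(n) + a*z));
    z        = z + dt*th2(n);
end

th1_exact = (dw/AK)*(1 - exp(-AK*t)) + phi0*exp(-AK*t);   % Eq. (4.36a)
max_dev   = max(abs(th1 - th1_exact));
fprintf('first-order: final error %.4f rad (predicted %.4f), max deviation %.2e\n', ...
        th1(end), dw/AK, max_dev);
fprintf('second-order: final error %.2e rad\n', th2(end));
```

With these values the first-order loop settles to the constant error (w0 - wc)/AK, while the second-order loop drives the error toward zero, in line with Eqs. (4.36b) and (4.38).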

First-Order Loop Analysis 

Here we shall use the nonlinear model in Fig. 4.26b, but for the simple case of H(s) = 1. For 
this case $h(t) = \delta(t)$,* and Eq. (4.33) gives 

$$\dot{\theta}_o(t) = AK\sin\theta_e(t)$$

Because $\theta_e = \theta_i - \theta_o$, 

$$\dot{\theta}_e = \dot{\theta}_i - AK\sin\theta_e(t) \tag{4.39}$$

Let us here consider the problem of frequency and phase acquisition. Let the incoming 
signal be $A\sin(\omega_0 t + \varphi_0)$ and let the VCO have a quiescent frequency $\omega_c$. Hence, 

$$\theta_i(t) = (\omega_0 - \omega_c)t + \varphi_0$$

and 

$$\dot{\theta}_e = (\omega_0 - \omega_c) - AK\sin\theta_e(t) \tag{4.40}$$

For a better understanding of PLL behavior, we use Eq. (4.40) to sketch $\dot{\theta}_e$ vs. $\theta_e$. Equation 
(4.40) shows that $\dot{\theta}_e$ is a vertically shifted sinusoid, as shown in Fig. 4.28. To satisfy Eq. (4.40), 
the loop operation must stay along the sinusoidal trajectory shown in Fig. 4.28. When $\dot{\theta}_e = 0$, 
the system is in equilibrium, because at these points, $\theta_e$ stops varying with time. Thus $\theta_e = 
\theta_1$, $\theta_2$, $\theta_3$, and $\theta_4$ are all equilibrium points. 

If the initial phase error $\theta_e(0) = \theta_{e0}$ (Fig. 4.28), then $\dot{\theta}_e$ corresponding to this value of $\theta_e$ 
is negative. Hence, the phase error will start decreasing along the sinusoidal trajectory until it 
reaches the value $\theta_3$, where equilibrium is attained. Hence, in steady state, the phase error is a 
constant $\theta_3$. This means the loop is in frequency lock; that is, the VCO frequency is now $\omega_0$, 
but there is a phase error of $\theta_3$. Note, however, that if $|\omega_0 - \omega_c| > AK$, there are no equilibrium 
points in Fig. 4.28, the loop never achieves lock, and $\theta_e$ continues to move along the trajectory 
forever. Hence, this simple loop can achieve phase lock provided the incoming frequency $\omega_0$ 
does not differ from the quiescent VCO frequency $\omega_c$ by more than AK. 

Figure 4.28 Trajectory of a first-order PLL. 

* Actually $h(t) = 2B\,\mathrm{sinc}(2\pi Bt)$, where B is the bandwidth of the loop filter. This is a low-pass, narrowband filter, 
which suppresses the high-frequency signal centered at $2\omega_c$. This makes H(s) = 1 over a low-pass narrow band of B 
Hz. 

In Fig. 4.28, several equilibrium points exist. Half of these points, however, are unstable 
equilibrium points, meaning that a slight perturbation in the system state will move the 
operating point farther away from these equilibrium points. Points $\theta_1$ and $\theta_3$ are stable points 
because any small perturbation in the system state will tend to bring it back to these points. 
Consider, for example, the point $\theta_3$. If the state is perturbed along the trajectory toward the 
right, $\dot{\theta}_e$ is negative, which tends to reduce $\theta_e$ and bring it back to $\theta_3$. If the operating point 
is perturbed from $\theta_3$ toward the left, $\dot{\theta}_e$ is positive, $\theta_e$ will tend to increase, and the operating 
point will return to $\theta_3$. On the other hand, at point $\theta_2$ if the point is perturbed toward the right, 
$\dot{\theta}_e$ is positive, and $\theta_e$ will increase until it reaches $\theta_3$. Similarly, if at $\theta_2$ the operating point is 
perturbed toward the left, $\dot{\theta}_e$ is negative, and $\theta_e$ will decrease until it reaches $\theta_1$. Hence, $\theta_2$ is 
an unstable equilibrium point. The slightest disturbance, such as noise, will dislocate it either 
to $\theta_1$ or to $\theta_3$. In a similar way, we can show that $\theta_4$ is an unstable point and that $\theta_1$ is a stable 
equilibrium point. 

The equilibrium point $\theta_3$ occurs where $\dot{\theta}_e = 0$. Hence, from Eq. (4.40), 

$$\theta_3 = \sin^{-1}\left(\frac{\omega_0 - \omega_c}{AK}\right)$$

If $\theta_3 \ll \pi/2$, then 

$$\theta_3 \simeq \frac{\omega_0 - \omega_c}{AK}$$

which agrees with our previous result of the small-error analysis [Eq. (4.36b)]. 

The first-order loop suffers from the fact that it has a constant phase error. Moreover, it 
can acquire frequency lock only if the incoming frequency and the VCO quiescent frequency 
differ by not more than AK rad/s. Higher order loops overcome these disadvantages, but they 
create a new problem of stability. More detailed analysis can be found in Gardner. 8 
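
The lock condition of the first-order loop can be illustrated by integrating Eq. (4.40) directly. The sketch below is a rough numerical experiment under assumed values (loop gain, two frequency offsets, initial phase error, forward-Euler step), none of which come from the text. One offset lies inside the range |w0 - wc| < AK and the other outside it.

```matlab
% Forward-Euler integration of Eq. (4.40): d(theta_e)/dt = dw - AK*sin(theta_e)
AK      = 100;            % loop gain (rad/s), illustrative
dw_list = [60 150];       % offsets w0 - wc (rad/s): inside and outside the lock range
dt      = 1e-4;  N = round(1/dt);     % simulate 1 s
th_end  = zeros(size(dw_list));
for k = 1:length(dw_list)
    th = 2.0;                          % assumed initial phase error (rad)
    for n = 1:N
        th = th + dt*(dw_list(k) - AK*sin(th));
    end
    th_end(k) = th;
end
th3 = asin(dw_list(1)/AK);             % stable equilibrium predicted for dw = 60
fprintf('dw = %d rad/s: final phase error %.4f rad (asin(dw/AK) = %.4f)\n', ...
        dw_list(1), th_end(1), th3);
fprintf('dw = %d rad/s: phase error keeps growing, %.1f rad after 1 s\n', ...
        dw_list(2), th_end(2));
```

In the first case the phase error slides down the trajectory and stops at the stable equilibrium. In the second case the right-hand side of Eq. (4.40) never reaches zero, so the phase error increases without bound and the loop does not lock.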

Generalization of PLL Behaviors 

To generalize, suppose that the loop is locked, meaning that the frequencies of both the input 
and the output sinusoids are identical. The two signals are said to be mutually phase coherent 
or in phase lock. The VCO thus tracks the frequency and the phase of the incoming signal. A 
PLL can track the incoming frequency only over a finite range of frequency shift. This range 
is called the hold-in or lock range. Moreover, if initially the input and output frequencies are 
not close enough, the loop may not acquire lock. The frequency range over which the input 
will cause the loop to lock is called the pull-in or capture range. Also if the input frequency 
changes too rapidly, the loop may not lock. 

If the input sinusoid is noisy, the PLL not only tracks the sinusoid, but also cleans it up. The 
PLL can also be used as a frequency modulation (FM) demodulator and frequency synthesizer, 
as shown later, in the next chapter. Frequency multipliers and dividers can also be built using 
a PLL. The PLL, being a relatively inexpensive integrated circuit, has become one of the most 
frequently used communication circuits. 

In space vehicles, because of the Doppler shift and oscillator drift, the frequency of the 
received signal has a lot of uncertainty. The Doppler shift of the carrier itself could be as high 
as ±75 kHz, whereas the desired modulated signal band may be just 10 Hz. To receive such a 
signal by conventional receivers would require a filter of bandwidth 150 kHz, when the desired 
signal has a bandwidth of only 10 Hz. This would cause an undesirable increase in the received 
noise (by a factor of 15,000), since the noise power is proportional to the bandwidth. The PLL 
proves convenient here because it tracks the received frequency continuously, and the filter 
bandwidth required is only 10 Hz. 
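
The factor of 15,000 is the ratio of the two filter bandwidths, since noise power scales with bandwidth. As a quick check, with hypothetical variable names:

```matlab
B_conv   = 150e3;                 % bandwidth needed by a conventional receiver (Hz)
B_pll    = 10;                    % bandwidth needed when a PLL tracks the carrier (Hz)
ratio    = B_conv/B_pll;          % increase in received noise power
ratio_dB = 10*log10(ratio);       % the same ratio in decibels
fprintf('noise power ratio: %d (%.2f dB)\n', ratio, ratio_dB);
```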

Carrier Acquisition in DSB-SC 

We shall now discuss two methods of carrier regeneration using PLL at the receiver in DSB-SC: 
signal squaring and the Costas loop. 


Signal-Squaring Method: 

An outline of this scheme is given in Fig. 4.29. The incoming signal is squared and then passed 
through a narrow (high Q) bandpass filter tuned to $2\omega_c$. The output of this filter is the sinusoid 
$k\cos 2\omega_c t$, with some residual unwanted signal. This signal is applied to a PLL to obtain a 
cleaner sinusoid of twice the carrier frequency, which is passed through a 2:1 frequency divider 
to obtain a local carrier in phase and frequency synchronism with the incoming carrier. The 
analysis is straightforward. The squarer output x(t) is 

$$x(t) = [m(t)\cos\omega_c t]^2 = \tfrac{1}{2}m^2(t) + \tfrac{1}{2}m^2(t)\cos 2\omega_c t$$

Now $m^2(t)$ is a nonnegative signal, and therefore has a nonzero average value [in contrast 
to m(t), which generally has a zero average value]. Let the average value, which is the dc 
component of $m^2(t)/2$, be k. We can now express $m^2(t)/2$ as 

$$\tfrac{1}{2}m^2(t) = k + \phi(t)$$

where $\phi(t)$ is a zero mean baseband signal [$m^2(t)/2$ minus its dc component]. Thus, 

$$x(t) = \tfrac{1}{2}m^2(t) + \tfrac{1}{2}m^2(t)\cos 2\omega_c t = \tfrac{1}{2}m^2(t) + k\cos 2\omega_c t + \phi(t)\cos 2\omega_c t$$

The bandpass filter is a narrowband (high-Q) filter tuned to frequency $2\omega_c$. It completely suppresses 
the signal $m^2(t)$, whose spectrum is centered at $\omega = 0$. It also suppresses most of the 
signal $\phi(t)\cos 2\omega_c t$. This is because although this signal spectrum is centered at $2\omega_c$, it has 
zero (infinitesimal) power at $2\omega_c$ since $\phi(t)$ has a zero dc value. Moreover this component 
is distributed over the band of 4B Hz centered at $2\omega_c$. Hence, very little of this signal passes 
through the narrowband filter.* In contrast, the spectrum of $k\cos 2\omega_c t$ consists of impulses 
located at $\pm 2\omega_c$. Hence, all its power is concentrated at $2\omega_c$ and will pass through. Thus, the 
filter output is $k\cos 2\omega_c t$ plus a small undesired residue from $\phi(t)\cos 2\omega_c t$. This residue 
can be suppressed by using a PLL, which tracks $k\cos 2\omega_c t$. The PLL output, after passing 
through a 2:1 frequency divider, yields the desired carrier. One qualification is in order. 
Because the incoming signal sign is lost in the squarer, we have a sign ambiguity (or phase 
ambiguity of $\pi$) in the carrier generated. This is immaterial for analog signals. For a digital 
baseband signal, however, the carrier sign is essential, and this method, therefore, cannot be used 
directly. 

* This will also explain why we cannot extract the carrier directly from $m(t)\cos\omega_c t$ by passing it through a 
narrowband filter centered at $\omega_c$. The reason is that the power of $m(t)\cos\omega_c t$ at $\omega_c$ is zero because m(t) has no dc 
component [the average value of m(t) is zero]. 
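
The effect of the squarer can be seen on a simple test signal. The sketch below is illustrative only: the two-tone message, the sampling rate, and the 1-second record (which gives 1 Hz FFT bins) are assumptions chosen so that every component falls exactly on a bin. It compares the spectral line at the carrier frequency before squaring with the line at twice the carrier frequency after squaring.

```matlab
fs = 10000;  N = 10000;  t = (0:N-1)/fs;        % 1 s record, 1 Hz bin spacing
fc = 300;                                       % carrier frequency (Hz)
m  = cos(2*pi*20*t) + 0.5*sin(2*pi*35*t);       % zero-mean test message (hypothetical)
s  = m.*cos(2*pi*fc*t);                         % DSB-SC signal
x  = s.^2;                                      % squarer output

S = abs(fft(s))/N;                              % normalized magnitude spectra
X = abs(fft(x))/N;
line_fc  = S(fc + 1);                           % component of s(t) at fc
line_2fc = X(2*fc + 1);                         % component of x(t) at 2*fc
k = mean(m.^2)/2;                               % dc component of m^2(t)/2
fprintf('line at fc before squaring: %.2e\n', line_fc);
fprintf('line at 2fc after squaring: %.4f (k/2 = %.4f)\n', line_2fc, k/2);
```

The DSB-SC signal has no component at the carrier frequency, whereas the squared signal carries a discrete line of amplitude k at twice the carrier frequency (seen as k/2 in each FFT bin at ±2fc).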


Costas Loop: Yet another scheme for generating a local carrier, proposed by Costas, 10 
is shown in Fig. 4.30. The incoming signal is $m(t)\cos(\omega_c t + \theta_i)$. At the receiver, a VCO 
generates the carrier $\cos(\omega_c t + \theta_o)$. The phase error is $\theta_e = \theta_i - \theta_o$. Various signals 
are indicated in Fig. 4.30. The two low-pass filters suppress high-frequency terms to 
yield $m(t)\cos\theta_e$ and $m(t)\sin\theta_e$, respectively. These outputs are further multiplied to give 
$\tfrac{1}{2}m^2(t)\sin 2\theta_e$. When this is passed through a narrowband low-pass filter, the output is $R\sin 2\theta_e$, 
where R is the dc component of $m^2(t)/2$. The signal $R\sin 2\theta_e$ is applied to the input of 
a VCO with quiescent frequency $\omega_c$. The input $R\sin 2\theta_e$ increases the output frequency, 
which, in turn, reduces $\theta_e$. This mechanism was fully discussed earlier in connection with 
Fig. 4.26. 
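
The error signal of the Costas loop can be evaluated for a few fixed phase errors. This is a baseband sketch under stated assumptions: the two low-pass filter outputs are taken as m(t) cos θe and m(t) sin θe, as in the description above, the narrowband low-pass filter is approximated by a time average, and the test message is hypothetical.

```matlab
fs = 10000;  N = 10000;  t = (0:N-1)/fs;
m  = cos(2*pi*20*t) + 0.5*sin(2*pi*35*t);     % zero-mean test message (hypothetical)
R  = mean(m.^2)/2;                            % dc component of m^2(t)/2

th_e = linspace(-pi, pi, 9);                  % fixed phase errors to test (rad)
err  = zeros(size(th_e));
for k = 1:length(th_e)
    I = m*cos(th_e(k));                       % in-phase low-pass filter output
    Q = m*sin(th_e(k));                       % quadrature low-pass filter output
    err(k) = mean(I.*Q);                      % time average stands in for the narrowband LPF
end
max_diff = max(abs(err - R*sin(2*th_e)));
fprintf('largest deviation from R*sin(2*theta_e): %.2e\n', max_diff);
```

Because the error depends on 2θe, it vanishes at θe = π as well as at θe = 0, so a loop driven by this signal cannot distinguish a carrier from its negative.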


Carrier Acquisition in SSB-SC 

For the purpose of synchronization at the SSB receiver, one may use highly stable crystal 
oscillators, with crystals cut for the same frequency at the transmitter and the receiver. At 
very high frequencies, where even quartz crystals may not have adequate performance, a pilot 
carrier may be transmitted. These are the same methods used for DSB-SC. However, neither 
the received-signal squaring technique nor the Costas loop used in DSB-SC can be used for 
SSB-SC. This can be seen by expressing the SSB signal as 

$$\varphi_{\rm SSB}(t) = m(t)\cos\omega_c t \mp m_h(t)\sin\omega_c t = E(t)\cos[\omega_c t + \theta(t)]$$

where 

$$E(t) = \sqrt{m^2(t) + m_h^2(t)}, \qquad \theta(t) = \tan^{-1}\left[\frac{\pm m_h(t)}{m(t)}\right]$$

Squaring this signal yields 

$$\varphi_{\rm SSB}^2(t) = E^2(t)\cos^2[\omega_c t + \theta(t)] = \frac{E^2(t)}{2}\{1 + \cos[2\omega_c t + 2\theta(t)]\}$$

The signal $E^2(t)$ is eliminated by a bandpass filter. Unfortunately, the remaining signal is not a 
pure sinusoid of frequency $2\omega_c$ (as was the case for DSB). There is nothing we can do to remove 
the time-varying phase $2\theta(t)$ from this sinusoid. Hence, for SSB, the squaring technique does 
not work. The same argument can be used to show that the Costas loop will not work either. 
These conclusions also apply to VSB signals. 
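
A single-tone message makes the failure of the squaring method easy to see. In the sketch below (tone frequency, sampling rate, and record length are illustrative assumptions), the upper-sideband signal reduces to one sinusoid at fc + fm, so squaring it produces a line at 2(fc + fm) and nothing at 2fc.

```matlab
fs = 10000;  N = 10000;  t = (0:N-1)/fs;           % 1 s record, 1 Hz bin spacing
fc = 300;  fm = 20;                                % carrier and tone frequencies (Hz)
m  = cos(2*pi*fm*t);                               % tone message
mh = sin(2*pi*fm*t);                               % its Hilbert transform
s_usb = m.*cos(2*pi*fc*t) - mh.*sin(2*pi*fc*t);    % USB signal = cos(2*pi*(fc+fm)*t)

X = abs(fft(s_usb.^2))/N;                          % spectrum of the squared signal
line_2fc   = X(2*fc + 1);                          % component at 2*fc
line_shift = X(2*(fc + fm) + 1);                   % component at 2*(fc + fm)
fprintf('line at 2fc: %.2e, line at 2(fc+fm): %.4f\n', line_2fc, line_shift);
```

The recovered line follows the message frequency rather than the carrier, which is the time-varying phase 2θ(t) in its simplest form.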


4.9 MATLAB EXERCISES 


In this section, we provide MATLAB exercises to reinforce some of the basic concepts on 
analog modulations covered in earlier sections. We will cover examples that illustrate the 
modulation and demodulation of DSB-SC, AM, SSB-SC, and QAM. 

DSB-SC Modulation and Demodulation 

The first MATLAB program, triplesinc.m, is to generate a signal that is (almost) strictly 
band-limited and consists of three different delayed versions of the sinc signal: 

$$m_2(t) = 2\,\mathrm{sinc}(2t/T_a) + \mathrm{sinc}(2t/T_a + 1) + \mathrm{sinc}(2t/T_a - 1)$$


% (triplesinc.m) 
% Baseband signal for AM 
% Usage m=triplesinc(t,Ta) 
function m=triplesinc(t,Ta) 
% t is the vector of time instants 
% Ta is the parameter, equaling twice the delay 
% 
sig_1=sinc(2*t/Ta); 
sig_2=sinc(2*t/Ta-1); 
sig_3=sinc(2*t/Ta+1); 
m=2*sig_1+sig_2+sig_3; 
end 


The DSB-SC signal can be generated with the MATLAB file ExampleDSB.m that generates 
a DSB-SC signal for t ∈ (−0.04, 0.04). The carrier frequency is 300 Hz. The original 
message signal and the DSB-SC signal for both time and frequency domains are illustrated in 
Fig. 4.31. 


% (ExampleDSB.m) 
% This program uses triplesinc.m to illustrate DSB modulation 
% and demodulation 
ts=1.e-4; 
t=-0.04:ts:0.04; 
Ta=0.01; 
m_sig=triplesinc(t,Ta); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)); 
M_fre=fftshift(fft(m_sig,Lfft)); 
freqm=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
% Carrier frequency of 300 Hz, as stated in the text 
s_dsb=m_sig.*cos(2*pi*300*t); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)+1); 
S_dsb=fftshift(fft(s_dsb,Lfft)); 
freqs=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
Trange=[-0.03 0.03 -2 2]; 
figure(1) 
subplot(221);td1=plot(t,m_sig); 
axis(Trange); set(td1,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})') 
subplot(223);td2=plot(t,s_dsb); 
axis(Trange); set(td2,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm DSB}({\it t})') 
Frange=[-600 600 0 200]; 
subplot(222);fd1=plot(freqm,abs(M_fre)); 
axis(Frange); set(fd1,'Linewidth',2); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}({\it f})') 
subplot(224);fd2=plot(freqs,abs(S_dsb)); 
axis(Frange); set(fd2,'Linewidth',2); 
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm DSB}({\it f})') 

Figure 4.32 


The first modulation example, ExampleDSBdemfilt.m, is based on a strictly low-pass 
message signal $m_0(t)$. Next, we will generate a different message signal that is not strictly 
band-limited. In effect, the new message signal consists of two triangles: 

$$m_1(t) = \Delta\left(\frac{t + 0.01}{0.01}\right) - \Delta\left(\frac{t - 0.01}{0.01}\right)$$

Coherent demodulation is also implemented with a finite impulse response (FIR) low-pass 
filter of order 40. The original message signal m(t), the DSB-SC signal $m(t)\cos\omega_c t$, the 
demodulator signal $e(t) = m(t)\cos^2\omega_c t$, and the recovered message signal $m_d(t)$ after low-pass 
filtering are all given in Fig. 4.32 for the time domain and in Fig. 4.33 for the frequency 
domain. The low-pass filter at the demodulator has bandwidth of 150 Hz. The demodulation 
result shows almost no distortion. 

% (ExampleDSBdemfilt.m) 

% This program uses triangl.m to illustrate DSB modulation 
% and demodulation 
ts=1.e-4; 
t=-0.04:ts:0.04; 
Ta=0.01; 
m_sig=triangl((t+0.01)/0.01)-triangl((t-0.01)/0.01); 
Lm_sig=length(m_sig); 
Lfft=length(t); 
Lfft=2^ceil(log2(Lfft)); 
M_fre=fftshift(fft(m_sig,Lfft)); 
freqm=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
B_m=150; % Bandwidth of the signal is B_m Hz. 
h=fir1(40,[B_m*ts]); 
t=-0.04:ts:0.04; 
Ta=0.01; fc=300; 
s_dsb=m_sig.*cos(2*pi*fc*t); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)+1); 
S_dsb=fftshift(fft(s_dsb,Lfft)); 
freqs=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
% Demodulation begins by multiplying with the carrier 
s_dem=s_dsb.*cos(2*pi*fc*t)*2; 
S_dem=fftshift(fft(s_dem,Lfft)); 
% Low-pass filtering with the FIR filter h designed above 
s_rec=filter(h,1,s_dem); 
S_rec=fftshift(fft(s_rec,Lfft)); 
Trange=[-0.025 0.025 -2 2]; 
figure(1) 
subplot(221);td1=plot(t,m_sig); 
axis(Trange); set(td1,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})'); 
title('message signal'); 
subplot(222);td2=plot(t,s_dsb); 
axis(Trange); set(td2,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm DSB}({\it t})') 
title('DSB-SC modulated signal'); 
subplot(223);td3=plot(t,s_dem); 
axis(Trange); set(td3,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it e}({\it t})') 
title('{\it e}({\it t})'); 
subplot(224);td4=plot(t,s_rec); 
axis(Trange); set(td4,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it m}_d({\it t})') 
title('Recovered signal'); 
Frange=[-700 700 0 200]; 
figure(2) 
subplot(221);fd1=plot(freqm,abs(M_fre)); 
axis(Frange); set(fd1,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}({\it f})'); 
title('message spectrum'); 
subplot(222);fd2=plot(freqs,abs(S_dsb)); 
axis(Frange); set(fd2,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm DSB}({\it f})'); 
title('DSB-SC spectrum'); 
subplot(223);fd3=plot(freqs,abs(S_dem)); 
axis(Frange); set(fd3,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it E}({\it f})'); 
title('spectrum of {\it e}({\it t})'); 
subplot(224);fd4=plot(freqs,abs(S_rec)); 
axis(Frange); set(fd4,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}_d({\it f})'); 
title('recovered spectrum'); 
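
The coherent demodulation steps of ExampleDSBdemfilt.m can also be sketched using only base MATLAB functions. The version below is not the program above: triangl.m is not listed in this section, so a hypothetical stand-in (a unit-height triangle on |x| < 1) is assumed; the low-pass filter is a Hamming-windowed sinc built by hand in place of fir1; and conv with the 'same' option is used so that the filter delay does not shift the output.

```matlab
% Hypothetical stand-in for triangl.m: unit-height triangle supported on |x| < 1
tri = @(x) (1 - abs(x)).*(abs(x) < 1);

ts = 1e-4;  t = -0.04:ts:0.04;  fc = 300;
m  = tri((t + 0.01)/0.01) - tri((t - 0.01)/0.01);   % two-triangle message
s  = m.*cos(2*pi*fc*t);                             % DSB-SC signal
e  = 2*s.*cos(2*pi*fc*t);                           % equals m + m.*cos(4*pi*fc*t)

% Order-40 windowed-sinc low-pass FIR filter with a 150 Hz cutoff
Nord = 40;  n = -Nord/2:Nord/2;  fcut = 150;
x  = 2*fcut*ts*n;
hs = ones(size(x));  nz = (x ~= 0);
hs(nz) = sin(pi*x(nz))./(pi*x(nz));                 % sinc evaluated without toolbox calls
w  = 0.54 + 0.46*cos(2*pi*n/Nord);                  % Hamming window centered at n = 0
h  = hs.*w;  h = h/sum(h);                          % unit gain at dc

m_rec    = conv(e, h, 'same');                      % centered output, no net delay
err_raw  = max(abs(e - m));                         % before filtering
err_filt = max(abs(m_rec - m));                     % after filtering
fprintf('max error before LPF: %.3f, after LPF: %.3f\n', err_raw, err_filt);
```

The remaining error is concentrated at the corners of the triangles, where the message is not band-limited and the short filter smooths the waveform.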


AM Modulation and Demodulation 

In this exercise, we generate a conventional AM signal with modulation index of μ = 1. Using 
the same message signal $m_1(t)$, the MATLAB program ExampleAMdemfilt.m generates 
the message signal, the corresponding AM signal, the rectified signal in noncoherent demodulation, 
and the rectified signal after passing through a low-pass filter. The low-pass filter at 
the demodulator has a bandwidth of 150 Hz. The signals in the time domain are shown in 
Fig. 4.34, whereas the corresponding frequency domain signals are shown in Fig. 4.35. 

Figure 4.34 Time domain signals in AM modulation and noncoherent demodulation. 

Notice the large impulse in the frequency domain of the AM signal. The limited time 
window means that no ideal impulse is possible and only very large spikes centered at the 
carrier frequency of ±300 Hz are visible. Finally, because the message signal is 
not strictly band-limited, the relatively low carrier frequency of 300 Hz forces the low-pass 
filter at the demodulator to truncate some of the message component in the demodulator. 
Distortion near the sharp corners of the recovered signal is visible. 


% (ExampleAMdemfilt.m) 
% This program uses triangl.m to illustrate AM modulation 
% and demodulation 
ts=1.e-4; 
t=-0.04:ts:0.04; 
% Carrier frequency of 300 Hz, as stated in the text 
Ta=0.01; fc=300; 
m_sig=triangl((t+0.01)/0.01)-triangl((t-0.01)/0.01); 
Lm_sig=length(m_sig); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)); 
M_fre=fftshift(fft(m_sig,Lfft)); 
freqm=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
B_m=150; % Bandwidth of the signal is B_m Hz. 
h=fir1(40,[B_m*ts]); 
% AM signal generated by adding a carrier to DSB-SC 
s_am=(1+m_sig).*cos(2*pi*fc*t); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)+1); 
S_am=fftshift(fft(s_am,Lfft)); 
freqs=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
% Demodulation begins by using a rectifier 
s_dem=s_am.*(s_am>0); 
S_dem=fftshift(fft(s_dem,Lfft)); 
% Low-pass filtering with the FIR filter h designed above 
s_rec=filter(h,1,s_dem); 
S_rec=fftshift(fft(s_rec,Lfft)); 
Trange=[-0.025 0.025 -2 2]; 
figure(1) 
subplot(221);td1=plot(t,m_sig); 
axis(Trange); set(td1,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})'); 
title('message signal'); 
subplot(222);td2=plot(t,s_am); 
axis(Trange); set(td2,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm DSB}({\it t})') 
title('AM modulated signal'); 
subplot(223);td3=plot(t,s_dem); 
axis(Trange); set(td3,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it e}({\it t})') 
title('rectified signal without local carrier'); 
subplot(224);td4=plot(t,s_rec); 
Trangelow=[-0.025 0.025 -0.5 1]; 
axis(Trangelow); set(td4,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it m}_d({\it t})') 
title('detected signal'); 
Frange=[-700 700 0 200]; 
figure(2) 
subplot(221);fd1=plot(freqm,abs(M_fre)); 
axis(Frange); set(fd1,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}({\it f})'); 
title('message spectrum'); 
subplot(222);fd2=plot(freqs,abs(S_am)); 
axis(Frange); set(fd2,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm AM}({\it f})'); 
title('AM spectrum'); 
subplot(223);fd3=plot(freqs,abs(S_dem)); 
axis(Frange); set(fd3,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it E}({\it f})'); 
title('rectified spectrum'); 
subplot(224);fd4=plot(freqs,abs(S_rec)); 
axis(Frange); set(fd4,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}_d({\it f})'); 
title('recovered spectrum'); 

Figure 4.35 


SSB-SC Modulation and Demodulation 

To illustrate the SSB-SC modulation and demodulation process, this exercise generates an SSB-SC 
signal using the same message signal $m_1(t)$ with double triangles. The carrier frequency 
is still 300 Hz. The MATLAB program ExampleSSBdemfilt.m performs this function. 
Coherent demodulation is applied in which a simple low-pass filter with bandwidth of 150 Hz 
is used to distill the recovered message signal. 

The time domain signals are shown in Fig. 4.36, whereas the corresponding frequency 
domain signals are shown in Fig. 4.37. 


% (ExampleSSBdemfilt.m) 
% This program uses triangl.m 
% to illustrate SSB modulation 
% and demodulation 
clear;clf; 
ts=1.e-4; 
t=-0.04:ts:0.04; 
Ta=0.01; fc=300; 
m_sig=triangl((t+0.01)/0.01)-triangl((t-0.01)/0.01); 
Lm_sig=length(m_sig); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)); 
M_fre=fftshift(fft(m_sig,Lfft)); 
freqm=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
B_m=150; % Bandwidth of the signal is B_m Hz. 
h=fir1(40,[B_m*ts]); 
s_dsb=m_sig.*cos(2*pi*fc*t); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)+1); 
S_dsb=fftshift(fft(s_dsb,Lfft)); 
% Zero the frequency bins with |f| < fc so that only the upper sideband remains 
L_lsb=floor(fc*ts*Lfft); 
SSBfilt=ones(1,Lfft); 
SSBfilt(Lfft/2-L_lsb+1:Lfft/2+L_lsb)=zeros(1,2*L_lsb); 
S_ssb=S_dsb.*SSBfilt; 
freqs=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
s_ssb=real(ifft(fftshift(S_ssb))); 
s_ssb=s_ssb(1:Lm_sig); 
% Demodulation begins by multiplying with the carrier 
s_dem=s_ssb.*cos(2*pi*fc*t)*2; 
S_dem=fftshift(fft(s_dem,Lfft)); 
% Low-pass filtering with the FIR filter h designed above 
s_rec=filter(h,1,s_dem); 
S_rec=fftshift(fft(s_rec,Lfft)); 
Trange=[-0.025 0.025 -1 1]; 
figure(1) 
subplot(221);td1=plot(t,m_sig); 
axis(Trange); set(td1,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})'); 
title('message signal'); 
subplot(222);td2=plot(t,s_ssb); 
axis(Trange); set(td2,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm SSB}({\it t})') 
title('SSB-SC modulated signal'); 
subplot(223);td3=plot(t,s_dem); 
axis(Trange); set(td3,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it e}({\it t})') 
title('after multiplying local carrier'); 
subplot(224);td4=plot(t,s_rec); 
axis(Trange); set(td4,'Linewidth',1.5); 
xlabel('{\it t} (sec)'); ylabel('{\it m}_d({\it t})') 
title('Recovered signal'); 
Frange=[-700 700 0 200]; 
figure(2) 
subplot(221);fd1=plot(freqm,abs(M_fre)); 
axis(Frange); set(fd1,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}({\it f})'); 
title('message spectrum'); 
subplot(222);fd2=plot(freqs,abs(S_ssb)); 
axis(Frange); set(fd2,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm DSB}({\it f})'); 
title('upper sideband SSB-SC spectrum'); 
subplot(223);fd3=plot(freqs,abs(S_dem)); 
axis(Frange); set(fd3,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it E}({\it f})'); 
title('detector spectrum'); 
subplot(224);fd4=plot(freqs,abs(S_rec)); 
axis(Frange); set(fd4,'Linewidth',1.5); 
xlabel('{\it f} (Hz)'); ylabel('{\it M}_d({\it f})'); 
title('recovered spectrum'); 

Figure 4.36 


QAM Modulation and Demodulation 

In this exercise, we will apply QAM to modulate and demodulate two message signals $m_1(t)$ 
and $m_2(t)$. The carrier frequency stays at 300 Hz, but two signals are simultaneously modulated 
and detected. The QAM signal is coherently demodulated by multiplying with $\cos 600\pi t$ and 
$\sin 600\pi t$, respectively, to recover the two message signals. Each signal product is filtered 
by the same low-pass filter of order 40. The MATLAB program ExampleQAMdemfilt.m 
completes this illustration by showing the time domain signals during the modulation and 
demodulation of the first signal $m_1(t)$ and the second signal $m_2(t)$. The time domain results for 
$m_1(t)$ are shown in Fig. 4.38, whereas the frequency domain signals are shown in Fig. 4.39. 
Additionally, the time domain results for $m_2(t)$ are shown in Fig. 4.40, whereas the frequency 
domain signals are shown in Fig. 4.41. 
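
The separation of the two messages relies on the orthogonality of the cosine and sine carriers. The sketch below illustrates this with assumptions that differ from the program in this exercise: the messages are two hypothetical tones, the record is 1 second long so that every component sits on an FFT bin, and an ideal brick-wall low-pass filter applied in the frequency domain replaces the order-40 FIR filter.

```matlab
fs = 10000;  N = 10000;  t = (0:N-1)/fs;        % 1 s record, 1 Hz bin spacing
fc = 300;                                       % carrier frequency (Hz)
m1 = cos(2*pi*10*t);                            % hypothetical first message
m2 = sin(2*pi*15*t);                            % hypothetical second message
s_qam = m1.*cos(2*pi*fc*t) + m2.*sin(2*pi*fc*t);

d1 = 2*s_qam.*cos(2*pi*fc*t);   % = m1 + m1.*cos(4*pi*fc*t) + m2.*sin(4*pi*fc*t)
d2 = 2*s_qam.*sin(2*pi*fc*t);   % = m2 - m2.*cos(4*pi*fc*t) + m1.*sin(4*pi*fc*t)

% Ideal low-pass filter: keep FFT bins with |f| <= 150 Hz
f = (0:N-1)*fs/N;  f(f >= fs/2) = f(f >= fs/2) - fs;
keep = abs(f) <= 150;
r1 = real(ifft(fft(d1).*keep));
r2 = real(ifft(fft(d2).*keep));

err1 = max(abs(r1 - m1));  err2 = max(abs(r2 - m2));
fprintf('recovery error: m1 %.2e, m2 %.2e\n', err1, err2);
```

Each product contains the wanted message at baseband plus terms centered at twice the carrier frequency, so the low-pass filter returns one message on each branch without crosstalk from the other.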
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Figure 4.39 
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Figure 4.41 
% (ExampleQAMdemfilt.m)
% This program uses triangl.m and triplesinc.m
% to illustrate QAM modulation and demodulation
% of two message signals
clear;clf;
ts=1.e-4;
t=-0.04:ts:0.04;
Ta=0.01; fc=300;
% Use triangl.m and triplesinc.m to generate
% two message signals of different shapes and spectra
m_sig1=triangl((t+0.01)/0.01)-triangl((t-0.01)/0.01);
m_sig2=triplesinc(t,Ta);
Lm_sig=length(m_sig1);
Lfft=length(t); Lfft=2^ceil(log2(Lfft));
M1_fre=fftshift(fft(m_sig1,Lfft));
M2_fre=fftshift(fft(m_sig2,Lfft));
freqm=(-Lfft/2:Lfft/2-1)/(Lfft*ts);
%
B_m=150;    % Bandwidth of the signal is B_m Hz.
% Design a simple lowpass filter with bandwidth B_m Hz.
h=fir1(40,[B_m*ts]);
% QAM signal: the two messages modulate quadrature carriers
s_qam=m_sig1.*cos(2*pi*fc*t)+m_sig2.*sin(2*pi*fc*t);
Lfft=length(t); Lfft=2^ceil(log2(Lfft)+1);
S_qam=fftshift(fft(s_qam,Lfft));
freqs=(-Lfft/2:Lfft/2-1)/(Lfft*ts);
% Demodulate the 1st signal by multiplying with the cosine carrier
s_dem1=s_qam.*cos(2*pi*fc*t)*2;
S_dem1=fftshift(fft(s_dem1,Lfft));
% Demodulate the 2nd signal by multiplying with the sine carrier
s_dem2=s_qam.*sin(2*pi*fc*t)*2;
S_dem2=fftshift(fft(s_dem2,Lfft));
%
% Low-pass filter both products with the FIR filter h designed above
s_rec1=filter(h,1,s_dem1);
S_rec1=fftshift(fft(s_rec1,Lfft));
s_rec2=filter(h,1,s_dem2);
S_rec2=fftshift(fft(s_rec2,Lfft));

Trange=[-0.025 0.025 -2 2];
Trange2=[-0.025 0.025 -2 4];
figure(1)
subplot(221);td1=plot(t,m_sig1);
axis(Trange); set(td1,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})');
title('message signal 1');
subplot(222);td2=plot(t,s_qam);
axis(Trange); set(td2,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm DSB}({\it t})')
title('QAM modulated signal');
subplot(223);td3=plot(t,s_dem1);
axis(Trange2); set(td3,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it x}({\it t})')
title('first demodulator output');
subplot(224);td4=plot(t,s_rec1);
axis(Trange); set(td4,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it m}_{d1}({\it t})')
title('detected signal 1');
figure(2)
subplot(221);td5=plot(t,m_sig2);
axis(Trange); set(td5,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})');
title('message signal 2');
% Plot the time-domain QAM signal (s_qam, not its spectrum S_qam)
subplot(222);td6=plot(t,s_qam);
axis(Trange); set(td6,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm DSB}({\it t})')
title('QAM modulated signal');
subplot(223);td7=plot(t,s_dem2);
axis(Trange2); set(td7,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it e}_1({\it t})')
title('second demodulator output');
subplot(224);td8=plot(t,s_rec2);
axis(Trange); set(td8,'Linewidth',1.5);
xlabel('{\it t} (sec)'); ylabel('{\it m}_{d2}({\it t})')
title('detected signal 2');

Frange=[-700 700 0 250];
figure(3)
subplot(221);fd1=plot(freqm,abs(M1_fre));
axis(Frange); set(fd1,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it M}({\it f})');
title('message 1 spectrum');
subplot(222);fd2=plot(freqs,abs(S_qam));
axis(Frange); set(fd2,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm AM}({\it f})');
title('QAM spectrum magnitude');
subplot(223);fd3=plot(freqs,abs(S_dem1));
axis(Frange); set(fd3,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it E}_1({\it f})');
title('first demodulator spectrum');
subplot(224);fd4=plot(freqs,abs(S_rec1));
axis(Frange); set(fd4,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it M}_{d1}({\it f})');
title('recovered spectrum 1');
figure(4)
subplot(221);fd1=plot(freqm,abs(M2_fre));
axis(Frange); set(fd1,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it M}({\it f})');
title('message 2 spectrum');
subplot(222);fd2=plot(freqs,abs(S_qam));
axis(Frange); set(fd2,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm AM}({\it f})');
title('QAM spectrum magnitude');
subplot(223);fd7=plot(freqs,abs(S_dem2));
axis(Frange); set(fd7,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it E}_2({\it f})');
title('second demodulator spectrum');
subplot(224);fd8=plot(freqs,abs(S_rec2));
axis(Frange); set(fd8,'Linewidth',1.5);
xlabel('{\it f} (Hz)'); ylabel('{\it M}_{d2}({\it f})');
title('recovered spectrum 2');
PROBLEMS


4.2-1 For each of the baseband signals: (i) $m(t) = \cos 1000\pi t$; (ii) $m(t) = 2\cos 1000\pi t + \sin 2000\pi t$;
(iii) $m(t) = \cos 1000\pi t \cos 3000\pi t$, do the following.

(a) Sketch the spectrum of $m(t)$.

(b) Sketch the spectrum of the DSB-SC signal $m(t)\cos 10{,}000\pi t$.

(c) Identify the upper sideband (USB) and the lower sideband (LSB) spectra.

(d) Identify the frequencies in the baseband, and the corresponding frequencies in the DSB-SC,
USB, and LSB spectra. Explain the nature of frequency shifting in each case.

4.2-2 Repeat Prob. 4.2-1 [parts (a), (b), and (c) only] if: (i) $m(t) = \mathrm{sinc}(100t)$; (ii) $m(t) = e^{-|t|}$;
(iii) $m(t) = e^{-|t-1|}$. Observe that $e^{-|t-1|}$ is $e^{-|t|}$ delayed by 1 second. For the last case you
need to consider both the amplitude and the phase spectra.

4.2-3 Repeat Prob. 4.2-1 [parts (a), (b), and (c) only] for $m(t) = e^{-|t|}$ if the carrier is
$\cos(10{,}000t - \pi/4)$.

Hint: Use Eq. (3.37).

4.2-4 You are asked to design a DSB-SC modulator to generate a modulated signal $km(t)\cos(\omega_c t + \theta)$,
where $m(t)$ is a signal band-limited to $B$ Hz. Figure P4.2-4 shows a DSB-SC modulator available
in the stock room. The carrier generator available generates not $\cos\omega_c t$, but $\cos^3\omega_c t$. Explain
whether you would be able to generate the desired signal using only this equipment. You may
use any kind of filter you like.

(a) What kind of filter is required in Fig. P4.2-4?

(b) Determine the signal spectra at points b and c, and indicate the frequency bands occupied
by these spectra.

(c) What is the minimum usable value of $\omega_c$?

(d) Would this scheme work if the carrier generator output were $\sin^3\omega_c t$? Explain.

(e) Would this scheme work if the carrier generator output were $\cos^n\omega_c t$ for any integer $n \ge 2$?


Figure P4.2-4
4.2-5 You are asked to design a DSB-SC modulator to generate a modulated signal $km(t)\cos\omega_c t$
with the carrier frequency $f_c = 300$ kHz ($\omega_c = 2\pi \times 300{,}000$). The following equipment is
available in the stock room: (i) a signal generator of frequency 100 kHz; (ii) a ring modulator;
(iii) a bandpass filter tuned to 300 kHz.

(a) Show how you can generate the desired signal.

(b) If the output of the modulator is $k \cdot m(t)\cos\omega_c t$, find $k$.

4.2-6 Amplitude modulators and demodulators can also be built without using multipliers. In Fig. P4.2-6,
the input $\phi(t) = m(t)$, and the amplitude $A \gg |\phi(t)|$. The two diodes are identical, with a
resistance of $r$ ohms in the conducting mode and infinite resistance in the cutoff mode. Show
that the output $e_o(t)$ is given by

$$ e_o(t) = \frac{2R}{R + r}\, w(t)\, m(t) $$

where $w(t)$ is the switching periodic signal shown in Fig. 2.20a with period $2\pi/\omega_c$ seconds.

(a) Hence, show that this circuit can be used as a DSB-SC modulator.

(b) How would you use this circuit as a synchronous demodulator for DSB-SC signals?



4.2-7 In Fig. P4.2-6, if $\phi(t) = \sin(\omega_c t + \theta)$, and the output $e_o(t)$ is passed through a low-pass filter,
then show that this circuit can be used as a phase detector, that is, a circuit that measures the
phase difference between two sinusoids of the same frequency ($\omega_c$).

Hint: Show that the filter output is a dc signal proportional to $\sin\theta$.

4.2-8 Two signals $m_1(t)$ and $m_2(t)$, both band-limited to 5000 Hz, are to be transmitted simultaneously
over a channel by the multiplexing scheme shown in Fig. P4.2-8. The signal at point b is the
multiplexed signal, which now modulates a carrier of frequency 20,000 Hz. The modulated
signal at point c is transmitted over a channel.

(a) Sketch signal spectra at points a, b, and c.

(b) What must be the bandwidth of the channel?

(c) Design a receiver to recover signals $m_1(t)$ and $m_2(t)$ from the modulated signal at
point c.

Figure P4.2-8

4.2-9 The system shown in Fig. P4.2-9 is used for scrambling audio signals. The output $y(t)$ is the
scrambled version of the input $m(t)$.

(a) Find the spectrum of the scrambled signal $y(t)$.

(b) Suggest a method of descrambling $y(t)$ to obtain $m(t)$.

A slightly modified version of this scrambler was first used commercially on the 25-mile
radio-telephone circuit connecting Los Angeles and Santa Catalina island.

Figure P4.2-9 (the output is labeled "Scrambled output")

4.2-10 A DSB-SC signal is given by $m(t)\cos(2\pi)10^6 t$. The carrier frequency of this signal, 1 MHz,
is to be changed to 400 kHz. The only equipment available consists of one ring modulator,
a bandpass filter centered at the frequency of 400 kHz, and one sine wave generator whose
frequency can be varied from 150 to 210 kHz. Show how you can obtain the desired signal
$c\,m(t)\cos(2\pi \times 400 \times 10^3 t)$ from $m(t)\cos(2\pi)10^6 t$. Determine the value of $c$.

4.3-1 Figure P4.3-1 shows a scheme for coherent (synchronous) demodulation. Show that this scheme
can demodulate the AM signal $[A + m(t)]\cos(2\pi f_c t)$ regardless of the value of $A$.

Figure P4.3-1 (diagram labels: AM input, local carrier $\cos\omega_c t$, low-pass filter, output)

4.3-2 Sketch the AM signal $[A + m(t)]\cos(2\pi f_c t)$ for the periodic triangle signal $m(t)$ shown in
Fig. P4.3-2 corresponding to the modulation indices (a) $\mu = 0.5$; (b) $\mu = 1$; (c) $\mu = 2$; (d)
$\mu = \infty$. How do you interpret the case of $\mu = \infty$?

Figure P4.3-2
4.3-3 For the AM signal with $m(t)$ shown in Fig. P4.3-2 and $\mu = 0.8$:

(a) Find the amplitude and power of the carrier.

(b) Find the sideband power and the power efficiency $\eta$.

4.3-4 (a) Sketch the DSB-SC signal corresponding to the message signal $m(t) = \cos 2\pi t$.

(b) The DSB-SC signal of part (a) is applied at the input of an envelope detector. Show that
the output of the envelope detector is not $m(t)$, but $|m(t)|$. Show that, in general, if an AM
signal $[A + m(t)]\cos\omega_c t$ is envelope-detected, the output is $|A + m(t)|$. Hence, show that
the condition for recovering $m(t)$ from the envelope detector is $A + m(t) > 0$ for all $t$.

4.3-5 Show that any scheme that can be used to generate DSB-SC can also generate AM. Is the
converse true? Explain.

4.3-6 Show that any scheme that can be used to demodulate DSB-SC can also demodulate AM. Is the
converse true? Explain.

4.3-7 In the text, the power efficiency of AM for a sinusoidal $m(t)$ was found. Carry out a similar
analysis when $m(t)$ is a random binary signal as shown in Fig. P4.3-7 and $\mu = 1$. Sketch the
AM signal with $\mu = 1$. Find the sideband's power and the total power (power of the AM signal)
as well as their ratio (the power efficiency $\eta$).

Figure P4.3-7

4.3-8 In the early days of radio, AM signals were demodulated by a crystal detector followed by a
low-pass filter and a dc blocker, as shown in Fig. P4.3-8. Assume a crystal detector to be basically
a squaring device. Determine the signals at points a, b, c, and d. Point out the distortion term in
the output $y(t)$. Show that if $A \gg |m(t)|$, the distortion is small.

Figure P4.3-8



4.4-1 In a QAM system (Fig. 4.19), the locally generated carrier has a frequency error $\Delta\omega$ and a phase
error $\delta$; that is, the receiver carrier is $\cos[(\omega_c + \Delta\omega)t + \delta]$ or $\sin[(\omega_c + \Delta\omega)t + \delta]$. Show that
the output of the upper receiver branch is

$$ m_1(t)\cos[(\Delta\omega)t + \delta] - m_2(t)\sin[(\Delta\omega)t + \delta] $$

instead of $m_1(t)$, and the output of the lower receiver branch is

$$ m_1(t)\sin[(\Delta\omega)t + \delta] + m_2(t)\cos[(\Delta\omega)t + \delta] $$

instead of $m_2(t)$.
4.4-2 A modulating signal $m(t)$ is given by:

(a) $m(t) = \cos 100\pi t + 2\cos 300\pi t$

(b) $m(t) = \sin 100\pi t \sin 500\pi t$

In each case:

(i) Sketch the spectrum of $m(t)$.

(ii) Find and sketch the spectrum of the DSB-SC signal $2m(t)\cos 1000\pi t$.

(iii) From the spectrum obtained in (ii), suppress the LSB spectrum to obtain the USB spectrum.

(iv) Knowing the USB spectrum in (ii), write the expression $\varphi_{USB}(t)$ for the USB signal.

(v) Repeat (iii) and (iv) to obtain the LSB signal $\varphi_{LSB}(t)$.

4.4-3 For the signals in Prob. 4.4-2, use Eq. (4.20) to determine the time domain expressions $\varphi_{LSB}(t)$
and $\varphi_{USB}(t)$ if the carrier frequency $\omega_c = 1000$.

Hint: If $m(t)$ is a sinusoid, its Hilbert transform $m_h(t)$ is the sinusoid $m(t)$ phase-delayed by $\pi/2$
rad.

4.4-4 Find $\varphi_{LSB}(t)$ and $\varphi_{USB}(t)$ for the modulating signal $m(t) = \pi B\,\mathrm{sinc}^2(2\pi Bt)$ with $B = 2000$
Hz and carrier frequency $f_c = 10{,}000$ Hz. Follow these steps:

(a) Sketch spectra of $m(t)$ and the corresponding DSB-SC signal $2m(t)\cos\omega_c t$.

(b) To find the LSB spectrum, suppress the USB in the DSB-SC spectrum found in part (a).

(c) Find the LSB signal $\varphi_{LSB}(t)$, which is the inverse Fourier transform of the LSB spectrum
found in part (b). Follow a similar procedure to find $\varphi_{USB}(t)$.

4.4-5 If $m_h(t)$ is the Hilbert transform of $m(t)$, then

(a) Show that the Hilbert transform of $m_h(t)$ is $-m(t)$.

(b) Show also that the energies of $m(t)$ and $m_h(t)$ are identical.

4.4-6 An LSB signal is demodulated coherently, as shown in Fig. P4.4-6. Unfortunately, because
of the transmission delay, the received signal carrier is not $2\cos\omega_c t$ as sent, but rather, is
$2\cos[(\omega_c + \Delta\omega)t + \delta]$. The local oscillator is still $\cos\omega_c t$. Show the following.

(a) When $\delta = 0$, the output $y(t)$ is the signal $m(t)$ with all its spectral components shifted
(offset) by $\Delta\omega$.

Hint: Observe that the output $y(t)$ is identical to the right-hand side of Eq. (4.20a) with $\omega_c$
replaced with $\Delta\omega$.

(b) When $\Delta\omega = 0$, the output is the signal $m(t)$ with phases of all its spectral components
shifted by $\delta$.

Hint: Show that the output spectrum $Y(f) = M(f)e^{j\delta}$ for $f \ge 0$, and equal to $M(f)e^{-j\delta}$
when $f < 0$.

(c) In each of these cases, explain the nature of distortion.

Hint: For part (a), demodulation consists of shifting an LSB spectrum to the left and right
by $\omega_c + \Delta\omega$ and low-pass-filtering the result. For part (b), use the expression (4.20b) for
$\varphi_{LSB}(t)$, multiply it by the local carrier $2\cos(\omega_c t + \delta)$, and low-pass-filter the result.

4.4-7 A USB signal is generated by using the phase shift method (Fig. 4.17). If the input to this system
is $m_h(t)$ instead of $m(t)$, what will be the output? Is this signal still an SSB signal with bandwidth
equal to that of $m(t)$? Can this signal be demodulated [to get back $m(t)$]? If so, how?

4.5-1 A vestigial filter $H_i(f)$ shown in the transmitter of Fig. 4.21 has a transfer function as shown in
Fig. P4.5-1. The carrier frequency is $f_c = 10$ kHz and the baseband signal bandwidth is 4 kHz.
Find the corresponding transfer function of the equalizer filter $H_o(f)$ shown in the receiver of
Fig. 4.21.

Hint: Use Eq. (4.25).









ANGLE MODULATION AND 
DEMODULATION 


As discussed in the previous chapter, carrier modulation can be achieved by modulating
the amplitude, frequency, and phase of a sinusoidal carrier of frequency $f_c$.
In that chapter, we focused on various linear amplitude modulation systems and their
demodulations. Now we discuss nonlinear frequency modulation (FM) and phase modulation
(PM), often collectively known as angle modulation.


5.1 NONLINEAR MODULATION 

In AM signals, the amplitude of a carrier is modulated by a signal $m(t)$, and, hence, the
information content of $m(t)$ is in the amplitude variations of the carrier. As we have seen,
the other two parameters of the carrier sinusoid, namely its frequency and phase, can also
be varied in proportion to the message signal as frequency-modulated and phase-modulated
signals, respectively. We now describe the essence of frequency modulation (FM) and phase
modulation (PM).

False Start 

In the 1920s, broadcasting was in its infancy. However, there was an active search for techniques
to reduce noise (static). Since the noise power is proportional to the modulated signal
bandwidth (sidebands), efforts were focused on finding a modulation scheme that would reduce the
bandwidth. More important still, bandwidth reduction also allows more users, and there were
rumors of a new method that had been discovered for eliminating sidebands (no sidebands, no
bandwidth!). The idea of frequency modulation (FM), where the carrier frequency would be
varied in proportion to the message $m(t)$, was quite intriguing. The carrier angular frequency
$\omega(t)$ would be varied with time so that $\omega(t) = \omega_c + km(t)$, where $k$ is an arbitrary constant.

If the peak amplitude of $m(t)$ is $m_p$, then the maximum and minimum values of the carrier
frequency would be $\omega_c + km_p$ and $\omega_c - km_p$, respectively. Hence, the spectral components
would remain within this band with a bandwidth $2km_p$ centered at $\omega_c$. The understanding was
that controlling the constant parameter $k$ can control the modulated signal bandwidth. While
this is true, there was also the hope that by using an arbitrarily small $k$, we could make the
information bandwidth arbitrarily small. This possibility was seen as a passport to
communication heaven. Unfortunately, experimental results showed that the underlying reasoning was
seriously wrong. The FM bandwidth, as it turned out, is always greater than (at best equal to)
the AM bandwidth. In some cases, its bandwidth was several times that of AM. Where was
the fallacy in the original reasoning? We shall soon find out.

The Concept of Instantaneous Frequency 

While AM signals carry a message with their varying amplitude, FM signals can vary the 
instantaneous frequency in proportion to the modulating signal m(t). This means that the 
carrier frequency is changing continuously every instant. Prima facie, this does not make 
much sense, since to define a frequency, we must have a sinusoidal signal at least over one 
cycle (or a half-cycle or a quarter-cycle) with the same frequency. This problem reminds us 
of our first encounter with the concept of instantaneous velocity in a beginning mechanics 
course. Until the presentation of derivatives via Leibniz and Newton, we were used to thinking 
of velocity as being constant over an interval, and we were incapable of even imagining that 
velocity could vary at each instant. We never forget, however, the wonder and amazement that 
was caused by the contemplation of derivative and instantaneous velocity when these concepts 
were first introduced. A similar experience awaits the reader with respect to instantaneous 
frequency. 

Let us consider a generalized sinusoidal signal $\varphi(t)$ given by

$$ \varphi(t) = A\cos\theta(t) \tag{5.1} $$

where $\theta(t)$ is the generalized angle and is a function of $t$. Figure 5.1 shows a hypothetical
case of $\theta(t)$. The generalized angle for a conventional sinusoid $A\cos(\omega_c t + \theta_0)$ is a straight
line $\omega_c t + \theta_0$, as shown in Fig. 5.1. The hypothetical general angle $\theta(t)$ happens to be
tangential to the angle $(\omega_c t + \theta_0)$ at some instant $t$. The crucial point is that, around $t$, over
a small interval $\Delta t \to 0$, the signal $\varphi(t) = A\cos\theta(t)$ and the sinusoid $A\cos(\omega_c t + \theta_0)$ are
identical; that is,

$$ \varphi(t) = A\cos(\omega_c t + \theta_0) \qquad t_1 < t < t_2 $$

We are certainly justified in saying that over this small interval $\Delta t$, the angular frequency of
$\varphi(t)$ is $\omega_c$. Because $(\omega_c t + \theta_0)$ is tangential to $\theta(t)$, the angular frequency of $\varphi(t)$ is the slope
of its angle $\theta(t)$ over this small interval. We can generalize this concept at every instant and
define that the instantaneous frequency $\omega_i$ at any instant $t$ is the slope of $\theta(t)$ at $t$. Thus, for
$\varphi(t)$ in Eq. (5.1), the instantaneous angular frequency and the generalized angle are related via

$$ \omega_i(t) = \frac{d\theta}{dt} \tag{5.2a} $$

$$ \theta(t) = \int_{-\infty}^{t} \omega_i(\alpha)\, d\alpha \tag{5.2b} $$

Figure 5.1 Concept of instantaneous frequency.
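
The slope definition of Eq. (5.2a) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `instantaneous_frequency`, the 100 Hz carrier, and the quadratic (chirp) angle are all illustrative assumptions. It estimates the slope of the angle with a central difference and compares it with the known answer for a straight-line angle and for a curved one.

```python
import math

def instantaneous_frequency(theta, t, dt=1e-6):
    """Central-difference estimate of omega_i(t) = d(theta)/dt, in rad/s."""
    return (theta(t + dt) - theta(t - dt)) / (2 * dt)

omega_c = 2 * math.pi * 100.0   # assumed carrier, 100 Hz
theta_0 = 0.3                   # assumed constant phase

def straight_line_angle(t):
    # Conventional sinusoid: the angle is a straight line with slope omega_c.
    return omega_c * t + theta_0

def curved_angle(t):
    # Hypothetical curved angle (a linear chirp); its slope is omega_c + 10*t.
    return omega_c * t + 5.0 * t ** 2

w_line = instantaneous_frequency(straight_line_angle, 0.2)
w_chirp = instantaneous_frequency(curved_angle, 0.2)
print(w_line / (2 * math.pi), w_chirp / (2 * math.pi))  # both close to 100 Hz
```

For the straight line the estimate equals the carrier frequency at every instant, while for the curved angle it changes with time, which is the sense in which a frequency can be defined "at an instant".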


Now we can see the possibility of transmitting the information of $m(t)$ by varying the angle $\theta$ of
a carrier. Such techniques of modulation, where the angle of the carrier is varied in some manner
with a modulating signal $m(t)$, are known as angle modulation or exponential modulation.
Two simple possibilities are phase modulation (PM) and frequency modulation (FM). In
PM, the angle $\theta(t)$ is varied linearly with $m(t)$:

$$ \theta(t) = \omega_c t + \theta_0 + k_p m(t) $$

where $k_p$ is a constant and $\omega_c$ is the carrier frequency. Assuming $\theta_0 = 0$, without loss of
generality,

$$ \theta(t) = \omega_c t + k_p m(t) \tag{5.3a} $$

The resulting PM wave is

$$ \varphi_{PM}(t) = A\cos[\omega_c t + k_p m(t)] \tag{5.3b} $$

The instantaneous angular frequency $\omega_i(t)$ in this case is given by

$$ \omega_i(t) = \frac{d\theta}{dt} = \omega_c + k_p \dot{m}(t) \tag{5.3c} $$

Hence, in PM, the instantaneous angular frequency $\omega_i$ varies linearly with the derivative of
the modulating signal. If the instantaneous frequency $\omega_i$ is varied linearly with the modulating
signal, we have FM. Thus, in FM the instantaneous angular frequency is

$$ \omega_i(t) = \omega_c + k_f m(t) \tag{5.4a} $$

where $k_f$ is a constant. The angle $\theta(t)$ is now

$$ \theta(t) = \int_{-\infty}^{t} [\omega_c + k_f m(\alpha)]\, d\alpha = \omega_c t + k_f \int_{-\infty}^{t} m(\alpha)\, d\alpha \tag{5.4b} $$

Here we have assumed the constant term in $\theta(t)$ to be zero without loss of generality. The FM
wave is

$$ \varphi_{FM}(t) = A\cos\left[\omega_c t + k_f \int_{-\infty}^{t} m(\alpha)\, d\alpha\right] \tag{5.5} $$
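
To make the two definitions concrete, here is a small sketch that builds the PM angle of Eq. (5.3a) and the FM angle inside Eq. (5.5) from samples of a message. It is only an illustration under stated assumptions: the function names `pm_angle` and `fm_angle`, the cosine test message, and all numerical constants are hypothetical, and the integral from $-\infty$ is replaced by a trapezoidal running sum that starts at the first sample (that is, $m(t)$ is taken to be zero before then).

```python
import math

def pm_angle(m, t, omega_c, k_p):
    """theta(t) = omega_c*t + k_p*m(t), as in Eq. (5.3a)."""
    return [omega_c * ti + k_p * mi for ti, mi in zip(t, m)]

def fm_angle(m, t, omega_c, k_f):
    """theta(t) = omega_c*t + k_f * (running integral of m), as in Eq. (5.5).

    Assumption: m is zero before t[0]; the integral uses the trapezoidal rule.
    """
    angles, integral = [], 0.0
    for i, ti in enumerate(t):
        if i > 0:
            integral += 0.5 * (m[i] + m[i - 1]) * (t[i] - t[i - 1])
        angles.append(omega_c * ti + k_f * integral)
    return angles

ts = 1e-5                                   # sampling interval (assumed)
t = [n * ts for n in range(2001)]           # 0 to 20 ms
omega_m = 2 * math.pi * 100.0               # 100 Hz test tone (assumed)
m = [math.cos(omega_m * ti) for ti in t]
A = 1.0
omega_c = 2 * math.pi * 2000.0              # 2 kHz carrier (assumed)
k_p = math.pi / 2
k_f = 2 * math.pi * 500.0

theta_pm = pm_angle(m, t, omega_c, k_p)
theta_fm = fm_angle(m, t, omega_c, k_f)
phi_pm = [A * math.cos(th) for th in theta_pm]
phi_fm = [A * math.cos(th) for th in theta_fm]
```

For this test tone the FM angle should follow $\omega_c t + (k_f/\omega_m)\sin\omega_m t$, so the only difference between the two modulators is whether the message or its running integral is added to the carrier angle.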
Figure 5.2 Phase and frequency modulation are interchangeable.


Relationship between FM and PM

From Eqs. (5.3b) and (5.5), it is apparent that PM and FM not only are very similar but are
inseparable. Replacing $m(t)$ in Eq. (5.3b) with $\int m(\alpha)\,d\alpha$ changes PM into FM. Thus, a signal
that is an FM wave corresponding to $m(t)$ is also the PM wave corresponding to $\int m(\alpha)\,d\alpha$
(Fig. 5.2a). Similarly, a PM wave corresponding to $m(t)$ is the FM wave corresponding to
$\dot{m}(t)$ (Fig. 5.2b). Therefore, by looking only at an angle-modulated signal $\varphi(t)$, there is no way
of telling whether it is FM or PM. In fact, it is meaningless to ask an angle-modulated wave
whether it is FM or PM. It is analogous to asking a married man with children whether he is
a father or a son. This discussion and Fig. 5.2 also show that we need not separately discuss
methods of generation and demodulation of each type of modulation.

Equations (5.3b) and (5.5) show that in both PM and FM the angle of a carrier is varied
in proportion to some measure of $m(t)$. In PM, it is directly proportional to $m(t)$, whereas in
FM, it is proportional to the integral of $m(t)$. As shown in Fig. 5.2b, a frequency modulator
can be directly used to generate an FM signal or the message input $m(t)$ can be processed
by a filter (differentiator) with transfer function $H(s) = s$ to generate PM signals. But why
should we limit ourselves to these cases? We have an infinite number of possible ways of
processing $m(t)$ before FM. If we restrict the choice to a linear operator, then a measure of
$m(t)$ can be obtained as the output of an invertible linear (time-invariant) system with transfer
function $H(s)$ or impulse response $h(t)$. The generalized angle-modulated carrier $\varphi_{EM}(t)$ can be
expressed as

$$ \varphi_{EM}(t) = A\cos[\omega_c t + \psi(t)] \tag{5.6a} $$

$$ \phantom{\varphi_{EM}(t)} = A\cos\left[\omega_c t + \int_{-\infty}^{t} m(\alpha)\, h(t - \alpha)\, d\alpha\right] \tag{5.6b} $$

As long as $H(s)$ is a reversible operation (or invertible), $m(t)$ can be recovered
from $\psi(t)$ by passing it through a system with transfer function $[H(s)]^{-1}$ as shown in
Fig. 5.3. Now PM and FM are just two special cases with $h(t) = k_p\delta(t)$ and $h(t) = k_f u(t)$,
respectively.

This shows that if we analyze one type of angle modulation (such as FM), we can readily
extend those results to any other kind. Historically, the angle modulation concept began with
FM, and here in this chapter we shall primarily analyze FM, with occasional discussion of
PM. But this does not mean that FM is superior to other kinds of angle modulation. On the
contrary, for most practical signals, PM is superior to FM. Actually, the optimum performance
is realized neither by pure PM nor by pure FM, but by something in between.

Figure 5.3 [Block diagram: $m(t)$ passes through $H(s)$ to give $\psi(t) = \int m(\alpha)\,h(t - \alpha)\,d\alpha$; the inverse filter $1/H(s)$ recovers $m(t)$ from the modulated phase.]


Power of an Angle-Modulated Wave 

Although the instantaneous frequency and phase of an angle-modulated wave can vary with
time, the amplitude $A$ remains constant. Hence, the power of an angle-modulated wave
(PM or FM) is always $A^2/2$, regardless of the value of $k_p$ or $k_f$.
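
A quick numerical check of this statement is sketched below. It is an illustration only: the function name `average_power`, the single-tone message $m(t) = \cos\omega_m t$, and every numerical value are assumptions, not taken from the text. The time average of $\varphi^2(t)$ is computed for three very different values of $k_f$ and compared with $A^2/2$.

```python
import math

def average_power(A, omega_c, k_f, omega_m, duration=1.0, n=100_000):
    """Time-average of phi(t)**2 over `duration` seconds for the FM wave
    phi(t) = A*cos(omega_c*t + (k_f/omega_m)*sin(omega_m*t)),
    which is Eq. (5.5) for the assumed message m(t) = cos(omega_m*t)."""
    dt = duration / n
    total = 0.0
    for i in range(n):
        t = i * dt
        phi = A * math.cos(omega_c * t + (k_f / omega_m) * math.sin(omega_m * t))
        total += phi * phi
    return total / n

A = 2.0
omega_c = 2 * math.pi * 1000.0          # 1 kHz carrier (assumed)
omega_m = 2 * math.pi * 10.0            # 10 Hz tone (assumed)
k_f_values = [0.0, 2 * math.pi * 50.0, 2 * math.pi * 500.0]

powers = [average_power(A, omega_c, k_f, omega_m) for k_f in k_f_values]
print(powers)   # each entry should be very close to A**2 / 2
```

The averaging interval is chosen to contain a whole number of carrier and message periods, so the oscillating part of $\varphi^2(t)$ averages out and only the constant $A^2/2$ remains.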


Example 5.1 Sketch FM and PM waves for the modulating signal $m(t)$ shown in Fig. 5.4a. The constants
$k_f$ and $k_p$ are $2\pi \times 10^5$ and $10\pi$, respectively, and the carrier frequency $f_c$ is 100 MHz.

Figure 5.4 FM and PM waveforms.

For FM:

$$ \omega_i = \omega_c + k_f m(t) $$

Dividing throughout by $2\pi$, we have the equation in terms of the variable $f$ (frequency in
hertz). The instantaneous frequency $f_i$ is

$$ f_i = f_c + \frac{k_f}{2\pi} m(t) = 10^8 + 10^5 m(t) $$

$$ (f_i)_{\min} = 10^8 + 10^5 [m(t)]_{\min} = 10^8 - 10^5 = 99.9 \text{ MHz} $$

$$ (f_i)_{\max} = 10^8 + 10^5 [m(t)]_{\max} = 10^8 + 10^5 = 100.1 \text{ MHz} $$

Because $m(t)$ increases and decreases linearly with time, the instantaneous frequency
increases linearly from 99.9 to 100.1 MHz over a half-cycle and decreases linearly from
100.1 to 99.9 MHz over the remaining half-cycle of the modulating signal (Fig. 5.4b).

For PM:

PM for $m(t)$ is FM for $\dot{m}(t)$. This also follows from Eq. (5.3c).

$$ f_i = f_c + \frac{k_p}{2\pi} \dot{m}(t) = 10^8 + 5\dot{m}(t) $$

$$ (f_i)_{\min} = 10^8 + 5[\dot{m}(t)]_{\min} = 10^8 - 10^5 = 99.9 \text{ MHz} $$

$$ (f_i)_{\max} = 10^8 + 5[\dot{m}(t)]_{\max} = 100.1 \text{ MHz} $$

Because $\dot{m}(t)$ switches back and forth from a value of $-20{,}000$ to $20{,}000$, the carrier
frequency switches back and forth from 99.9 to 100.1 MHz every half-cycle of $\dot{m}(t)$, as
shown in Fig. 5.4d.
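
The arithmetic of this example can be reproduced in a few lines. This is a sketch only: the extreme values of $m(t)$ ($\pm 1$) are an assumption inferred from the 99.9 and 100.1 MHz results (Fig. 5.4a itself is not reproduced here), while the slope values $\pm 20{,}000$ are the ones quoted in the example.

```python
import math

f_c = 100e6                         # carrier frequency, Hz
k_f = 2 * math.pi * 1e5             # FM constant from the example
k_p = 10 * math.pi                  # PM constant from the example

m_min, m_max = -1.0, 1.0            # assumed peak values of m(t)
mdot_min, mdot_max = -20_000.0, 20_000.0   # slopes of m(t) quoted in the example

# FM: f_i = f_c + (k_f / 2 pi) * m(t)
fm_range = (f_c + k_f / (2 * math.pi) * m_min,
            f_c + k_f / (2 * math.pi) * m_max)

# PM: f_i = f_c + (k_p / 2 pi) * dm/dt
pm_range = (f_c + k_p / (2 * math.pi) * mdot_min,
            f_c + k_p / (2 * math.pi) * mdot_max)

print([f / 1e6 for f in fm_range])  # MHz, roughly 99.9 and 100.1
print([f / 1e6 for f in pm_range])  # MHz, roughly 99.9 and 100.1
```

Both modulators sweep the same frequency range here; what differs is the shape of the sweep, linear for FM and a two-level switch for PM.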


This indirect method of sketching PM [using $\dot{m}(t)$ to frequency-modulate a carrier] works
as long as $m(t)$ is a continuous signal. If $m(t)$ is discontinuous, it means that the PM signal
has sudden phase changes and, hence, $\dot{m}(t)$ contains impulses. This indirect method
fails at points of the discontinuity. In such a case, a direct approach should be used at the
point of discontinuity to specify the sudden phase changes. This is demonstrated in the next
example.

Example 5.2 Sketch FM and PM waves for the digital modulating signal $m(t)$ shown in Fig. 5.5a. The
constants $k_f$ and $k_p$ are $2\pi \times 10^5$ and $\pi/2$, respectively, and $f_c = 100$ MHz.

Figure 5.5 FM and PM waveforms.

For FM:

Because $m(t)$ switches from 1 to $-1$ and vice versa, the FM wave frequency switches
back and forth between 99.9 and 100.1 MHz, as shown in Fig. 5.5b. This scheme of carrier
frequency modulation by a digital signal (Fig. 5.5b) is called frequency shift keying
(FSK) because information digits are transmitted by keying different frequencies (see
Sec. 7.8).

For PM:

$$ f_i = f_c + \frac{k_p}{2\pi} \dot{m}(t) = 10^8 + \frac{1}{4}\dot{m}(t) $$

The derivative $\dot{m}(t)$ (Fig. 5.5c) is zero except at points of discontinuity of $m(t)$ where
impulses of strength $\pm 2$ are present. This means that the frequency of the PM signal stays
the same except at these isolated points of time! It is not immediately apparent how an
instantaneous frequency can be changed by an infinite amount and then changed back to
the original frequency in zero time. Let us consider the direct approach:

$$ \varphi_{PM}(t) = A\cos[\omega_c t + k_p m(t)] = A\cos\left[\omega_c t + \frac{\pi}{2} m(t)\right]
= \begin{cases} A\sin\omega_c t & \text{when } m(t) = -1 \\ -A\sin\omega_c t & \text{when } m(t) = 1 \end{cases} $$

This PM wave is shown in Fig. 5.5d. This scheme of carrier PM by a digital signal is
called phase shift keying (PSK) because information digits are transmitted by shifting
the carrier phase. Note that PSK may also be viewed as a DSB-SC modulation
by $m(t)$.

The PM wave $\varphi_{PM}(t)$ in this case has phase discontinuities at instants where impulses
of $\dot{m}(t)$ are located. At these instants, the carrier phase shifts by $\pi$ instantaneously. A finite
phase shift in zero time implies infinite instantaneous frequency at these instants. This
agrees with our observation about $\dot{m}(t)$.

The amount of phase discontinuity in $\varphi_{PM}(t)$ at the instant where $m(t)$ is discontinuous
is $k_p m_d$, where $m_d$ is the amount of discontinuity in $m(t)$ at that instant. In the present
example, the amplitude of $m(t)$ changes by 2 (from $-1$ to 1) at the discontinuity. Hence,
the phase discontinuity in $\varphi_{PM}(t)$ is $k_p m_d = (\pi/2) \times 2 = \pi$ rad, which confirms our
earlier result.
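
The two facts used in the PM part of this example, that $A\cos[\omega_c t + (\pi/2)m(t)]$ reduces to $\pm A\sin\omega_c t$ and that the phase jump is $k_p m_d = \pi$, can be confirmed numerically. The sketch below is illustrative only; the low carrier frequency and the sampling instants are arbitrary assumptions chosen to keep the check small.

```python
import math

A = 1.0
k_p = math.pi / 2
omega_c = 2 * math.pi * 5.0     # arbitrary low carrier, used only for the check

def phi_pm(t, m):
    """PM wave of Eq. (5.3b) for a constant message level m."""
    return A * math.cos(omega_c * t + k_p * m)

samples = [n * 0.01 for n in range(100)]

# m(t) = -1 should give +A sin(omega_c t); m(t) = +1 should give -A sin(omega_c t).
max_err_neg = max(abs(phi_pm(t, -1) - A * math.sin(omega_c * t)) for t in samples)
max_err_pos = max(abs(phi_pm(t, +1) + A * math.sin(omega_c * t)) for t in samples)

# Phase jump at a transition of m(t) from -1 to +1: k_p * m_d with m_d = 2.
phase_jump = k_p * (1 - (-1))
print(max_err_neg, max_err_pos, phase_jump / math.pi)
```

Because the two waveforms are exact negatives of each other, the PSK signal here is also $-m(t)\,A\sin\omega_c t$, which is the DSB-SC view mentioned in the example.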
When $m(t)$ is a digital signal (as in Fig. 5.5a), $\varphi_{PM}(t)$ shows a phase discontinuity
where $m(t)$ has a jump discontinuity. We shall now show that to avoid ambiguity in
demodulation, in such a case, the phase deviation $k_p m(t)$ must be restricted to a range
$(-\pi, \pi)$. For example, if $k_p$ were $3\pi/2$ in the present example, then

$$ \varphi_{PM}(t) = A\cos\left[\omega_c t + \frac{3\pi}{2} m(t)\right] $$

In this case $\varphi_{PM}(t) = A\sin\omega_c t$ when $m(t) = 1$ or $-1/3$. This will certainly cause
ambiguity at the receiver when $A\sin\omega_c t$ is received. Specifically, the receiver cannot
decide the exact value of $m(t)$. Such ambiguity never arises if $k_p m(t)$ is restricted to the
range $(-\pi, \pi)$.


What causes this ambiguity? When $m(t)$ has jump discontinuities, the phase of $\varphi_{PM}(t)$
changes instantaneously. Because a phase $\varphi_0 + 2\pi n$ is indistinguishable from the phase $\varphi_0$,
ambiguities will be inherent in the demodulator unless the phase variations are limited to the
range $(-\pi, \pi)$. This means $k_p$ should be small enough to restrict the phase change $k_p m(t)$ to
the range $(-\pi, \pi)$.
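
The ambiguity for $k_p = 3\pi/2$ can be seen directly by comparing the waveforms produced by the two message levels $m = 1$ and $m = -1/3$. The following is a small sketch under assumed values (the carrier frequency, sample instants, and the helper name `wrap` are illustrative, not from the text).

```python
import math

k_p = 3 * math.pi / 2
omega_c = 2 * math.pi * 5.0     # arbitrary carrier for the check

def phi_pm(t, m, A=1.0):
    """PM wave A*cos(omega_c*t + k_p*m) for a constant message level m."""
    return A * math.cos(omega_c * t + k_p * m)

def wrap(phase):
    """Map a phase to its principal value in (-pi, pi]."""
    return math.atan2(math.sin(phase), math.cos(phase))

times = [n * 0.003 for n in range(200)]

# Largest difference between the waves for m = 1 and m = -1/3.
worst_gap = max(abs(phi_pm(t, 1.0) - phi_pm(t, -1.0 / 3.0)) for t in times)

# Both phase deviations wrap to the same principal value, -pi/2.
wrapped_a = wrap(k_p * 1.0)
wrapped_b = wrap(k_p * (-1.0 / 3.0))
print(worst_gap, wrapped_a, wrapped_b)
```

The deviation $3\pi/2$ lies outside $(-\pi, \pi)$ and wraps to $-\pi/2$, the same phase that $m = -1/3$ produces, so a receiver that sees only the instantaneous phase cannot tell the two levels apart.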

No such restriction on $k_p$ is required if $m(t)$ is continuous. In this case the phase change is
not instantaneous, but gradual over time, and a phase $\varphi_0 + 2\pi n$ will exhibit $n$ additional carrier
cycles compared with the case of a phase of only $\varphi_0$. We can detect the PM wave by using an FM demodulator
followed by an integrator (see Prob. 5.4-1). The additional $n$ cycles will be detected by the
FM demodulator, and the subsequent integration will yield a phase $2\pi n$. Hence, the phases $\varphi_0$
and $\varphi_0 + 2\pi n$ can be detected without ambiguity. This conclusion can also be verified from
Example 5.1, where the maximum phase change $\Delta\varphi = 10\pi$.

Because a band-limited signal cannot have jump discontinuities, we can also say that when
$m(t)$ is band-limited, $k_p$ has no restrictions.


5.2 BANDWIDTH OF ANGLE-MODULATED WAVES 

Unlike AM, angle modulation is nonlinear and no properties of Fourier transform can be
directly applied for its bandwidth analysis. To determine the bandwidth of an FM wave, let us
define

$$ a(t) = \int_{-\infty}^{t} m(\alpha)\, d\alpha \tag{5.7} $$

and define

$$ \hat{\varphi}_{FM}(t) = A e^{j[\omega_c t + k_f a(t)]} = A e^{jk_f a(t)} e^{j\omega_c t} \tag{5.8a} $$

such that its relationship to the FM signal is

$$ \varphi_{FM}(t) = \mathrm{Re}\,[\hat{\varphi}_{FM}(t)] \tag{5.8b} $$

Expanding the exponential $e^{jk_f a(t)}$ of Eq. (5.8a) in power series yields

$$ \hat{\varphi}_{FM}(t) = A\left[1 + jk_f a(t) - \frac{k_f^2}{2!} a^2(t) + \cdots + j^n \frac{k_f^n}{n!} a^n(t) + \cdots\right] e^{j\omega_c t} \tag{5.9a} $$

and

$$ \varphi_{FM}(t) = \mathrm{Re}\,[\hat{\varphi}_{FM}(t)]
= A\left[\cos\omega_c t - k_f a(t)\sin\omega_c t - \frac{k_f^2}{2!} a^2(t)\cos\omega_c t + \frac{k_f^3}{3!} a^3(t)\sin\omega_c t + \cdots\right] \tag{5.9b} $$

The modulated wave consists of an unmodulated carrier plus various amplitude-modulated
terms, such as $a(t)\sin\omega_c t$, $a^2(t)\cos\omega_c t$, $a^3(t)\sin\omega_c t$, .... The signal $a(t)$ is an integral
of $m(t)$. If $M(f)$ is band-limited to $B$, $A(f)$ is also band-limited* to $B$. The spectrum of
$a^2(t)$ is simply $A(f) * A(f)$ and is band-limited to $2B$. Similarly, the spectrum of $a^n(t)$ is
band-limited to $nB$. Hence, the spectrum consists of an unmodulated carrier plus spectra of
$a(t)$, $a^2(t)$, ..., $a^n(t)$, ..., centered at $\omega_c$. Clearly, the modulated wave is not band-limited.
It has an infinite bandwidth and is not related to the modulating-signal spectrum in any simple
way, as was the case in AM.

Although the bandwidth of an FM wave is theoretically infinite, for practical signals with
bounded $|a(t)|$, $|k_f a(t)|$ will remain finite. Because $n!$ increases much faster than $|k_f a(t)|^n$, we
have

$$\frac{k_f^n a^n(t)}{n!} \simeq 0 \qquad \text{for large } n$$

Hence, we shall see that most of the modulated-signal power resides in a finite bandwidth.
This is the principal foundation of the bandwidth analysis for angle modulations. There are
two distinct possibilities in terms of bandwidths: narrowband FM and wideband FM.

Narrowband Angle Modulation Approximation 

Unlike AM, angle modulations are nonlinear. The nonlinear relationship between $a(t)$ and $\varphi(t)$
is evident from the terms involving $a^n(t)$ in Eq. (5.9b). When $k_f$ is very small such that

$$|k_f a(t)| \ll 1$$

then all higher order terms in Eq. (5.9b) are negligible except for the first two. We then have a
good approximation

$$\varphi_{FM}(t) \approx A[\cos\omega_c t - k_f a(t)\sin\omega_c t] \qquad (5.10)$$

This approximation is a linear modulation that has an expression similar to that of the AM
signal with message signal $a(t)$. Because the bandwidth of $a(t)$ is $B$ Hz, the bandwidth of
$\varphi_{FM}(t)$ in Eq. (5.10) is $2B$ Hz according to the frequency-shifting property due to the term
$a(t)\sin\omega_c t$. For this reason, the FM signal for the case of $|k_f a(t)| \ll 1$ is called narrowband
FM (NBFM). Similarly, the narrowband PM (NBPM) signal is approximated by

$$\varphi_{PM}(t) \approx A[\cos\omega_c t - k_p m(t)\sin\omega_c t] \qquad (5.11)$$

NBPM also has the approximate bandwidth of $2B$.
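To get a feel for how small $|k_f a(t)|$ must be before Eq. (5.10) is a good approximation, the following sketch compares the exact angle-modulated sample with its narrowband approximation over one carrier period. The function names, the 1 kHz carrier, and the three trial values of $k_f a(t)$ are illustrative assumptions, not values from the text.

```python
import math

def exact_fm(A, wc, kf_a, t):
    """Exact sample A*cos(wc*t + kf*a(t)); the product kf*a(t) is passed in as kf_a."""
    return A * math.cos(wc * t + kf_a)

def nbfm_approx(A, wc, kf_a, t):
    """Narrowband approximation of Eq. (5.10): A*[cos(wc*t) - kf*a(t)*sin(wc*t)]."""
    return A * (math.cos(wc * t) - kf_a * math.sin(wc * t))

def worst_case_error(kf_a, A=1.0, wc=2 * math.pi * 1e3, samples=2000):
    """Largest |exact - approximation| over one carrier period for a fixed kf*a(t)."""
    period = 2 * math.pi / wc
    worst = 0.0
    for i in range(samples):
        t = i * period / samples
        worst = max(worst, abs(exact_fm(A, wc, kf_a, t) - nbfm_approx(A, wc, kf_a, t)))
    return worst

# Assumed trial values of |kf a(t)|: clearly small, moderately small, not small.
for x in (0.05, 0.2, 1.0):
    print(f"|kf a(t)| = {x:4.2f}  max error = {worst_case_error(x):.4f}")
```

The error grows roughly as the square of $|k_f a(t)|$, which is the size of the first neglected term in Eq. (5.9b).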


* This is because integration is a linear operation equivalent to passing a signal through a transfer function $1/j2\pi f$.
Hence, if $M(f)$ is band-limited to $B$, $A(f)$ must also be band-limited to $B$.

A comparison of NBFM [Eq. (5.10)] with AM [Eq. (5.9a)] brings out clearly the similarities
and differences between the two types of modulation. Both have the same modulated bandwidth
$2B$. The sideband spectrum for FM has a phase shift of $\pi/2$ with respect to the carrier, whereas
that of AM is in phase with the carrier. It must be remembered, however, that despite the
apparent similarities, the AM and FM signals have very different waveforms. In an AM signal,
the oscillation frequency is constant and the amplitude varies with time, whereas in an FM
signal, the amplitude stays constant and the frequency varies with time.


Wideband FM (WBFM) Bandwidth Analysis: The Fallacy Exposed 

Note that an FM signal is meaningful only if its frequency deviation is large enough. In other
words, practical FM chooses the constant $k_f$ large enough that the condition $|k_f a(t)| \ll 1$
is not satisfied. We call FM signals in such cases wideband FM (WBFM). Thus, in analyzing
the bandwidth of WBFM, we cannot ignore all the higher order terms in Eq. (5.9b).
To begin, we shall take here the route of the pioneers, who by their intuitively simple
reasoning came to grief in estimating the FM bandwidth. If we could discover the fallacy in
their reasoning, we would have a chance of obtaining a better estimate of the (wideband) FM
bandwidth.

Consider a low-pass $m(t)$ with bandwidth $B$ Hz. This signal is well approximated by a
staircase signal $\hat{m}(t)$, as shown in Fig. 5.6a. The signal $m(t)$ is now approximated by pulses of
constant amplitude. For convenience, each of these pulses will be called a "cell." To ensure
that $\hat{m}(t)$ has all the information of $m(t)$, the cell width in $\hat{m}(t)$ must be no greater than the
Nyquist interval of $1/2B$ second according to the sampling theorem (Chapter 6).

[Figure 5.6]

It is relatively easier to analyze FM corresponding to $\hat{m}(t)$ because it consists of constant-amplitude
pulses (cells) of width $T = 1/2B$ second. Consider a typical cell starting at $t = t_k$. This
cell has a constant amplitude $m(t_k)$. Hence, the FM signal corresponding to this cell is a
sinusoid of frequency $\omega_c + k_f m(t_k)$ and duration $T = 1/2B$, as shown in Fig. 5.6b. The FM
signal for $\hat{m}(t)$ consists of a sequence of such constant-frequency sinusoidal pulses of duration
$T = 1/2B$ corresponding to various cells of $\hat{m}(t)$. The FM spectrum for $\hat{m}(t)$ consists of
the sum of the Fourier transforms of these sinusoidal pulses corresponding to all the cells.
The Fourier transform of a sinusoidal pulse in Fig. 5.6b (corresponding to the $k$th cell) is a
sinc function, shown shaded in Fig. 5.6c [see Eq. (3.27a) with $\tau = 1/2B$ and Eq. (3.26) with
$f_0 = f_c + k_f m(t_k)/2\pi$]:

$$\mathrm{rect}(2Bt)\cos[\omega_c t + k_f m(t_k)t] \iff \frac{1}{4B}\,\mathrm{sinc}\left[\frac{\omega + \omega_c + k_f m(t_k)}{4B}\right] + \frac{1}{4B}\,\mathrm{sinc}\left[\frac{\omega - \omega_c - k_f m(t_k)}{4B}\right]$$


Note that the spectrum of this pulse is spread out on either side of its center frequency
$\omega_c + k_f m(t_k)$ by $4\pi B$ as the main lobe of the sinc function. Figure 5.6c shows the spectra of sinusoidal
pulses corresponding to various cells. The minimum and the maximum amplitudes of the cells
are $-m_p$ and $m_p$, respectively. Hence, the minimum and maximum center frequencies of
the short sinusoidal pulses corresponding to the FM signal for all the cells are $\omega_c - k_f m_p$
and $\omega_c + k_f m_p$, respectively. Consider the sinc main lobe of these frequency responses as the
significant contribution to the FM bandwidth, as shown in Fig. 5.6c. Hence, the maximum and
the minimum significant frequencies in this spectrum are $\omega_c + k_f m_p + 4\pi B$ and $\omega_c - k_f m_p - 4\pi B$,
respectively. The FM spectrum bandwidth is approximately

$$B_{FM} = \frac{1}{2\pi}(2k_f m_p + 8\pi B) = 2\left(\frac{k_f m_p}{2\pi} + 2B\right)\ \text{Hz}$$

We can now understand the fallacy in the reasoning of the pioneers. The maximum and
minimum carrier frequencies are $\omega_c + k_f m_p$ and $\omega_c - k_f m_p$, respectively. Hence, it was reasoned
that the spectral components must also lie in this range, resulting in the FM bandwidth of $2k_f m_p$.
The implicit assumption was that a sinusoid of frequency $\omega$ has its entire spectrum concentrated
at $\omega$. Unfortunately, this is true only of the everlasting sinusoid with $T = \infty$ (because it turns
the sinc function into an impulse). For a sinusoid of finite duration $T$ seconds, the spectrum is
spread out by the sinc on either side of $\omega$ by at least the main lobe width of $2\pi/T$. The pioneers
had missed this spreading effect.

For notational convenience, given the deviation of the carrier frequency (in radians per
second) by $\pm k_f m_p$, we shall denote the peak frequency deviation in hertz by $\Delta f$. Thus,

$$\Delta f = k_f\,\frac{m_{\max} - m_{\min}}{2 \cdot 2\pi}$$

which reduces to $\Delta f = k_f m_p/2\pi$ when $m_{\max} = -m_{\min} = m_p$.

The estimated FM bandwidth (in hertz) can then be expressed as

$$B_{FM} \simeq 2(\Delta f + 2B) \qquad (5.12)$$

The bandwidth estimate thus obtained is somewhat higher than the actual value because this
is the bandwidth corresponding to the staircase approximation of $m(t)$, not the actual $m(t)$,
which is considerably smoother. Hence, the actual FM bandwidth is somewhat smaller than
this value. Based on Fig. 5.6c, it is clear that a better FM bandwidth approximation is between

$$[2\Delta f,\ 2\Delta f + 4B]$$

Therefore, we should readjust our bandwidth estimation. To make this midcourse correction,
we observe that for the case of NBFM, $k_f$ is very small. Hence, given a fixed $m_p$, $\Delta f$ is
very small (in comparison to $B$) for NBFM. In this case, we can ignore the small $\Delta f$ term in
Eq. (5.12) with the result

$$B_{FM} \approx 4B$$

But we showed earlier that for narrowband, the FM bandwidth is approximately $2B$ Hz. This
indicates that a better bandwidth estimate is

$$B_{FM} = 2(\Delta f + B) = 2\left(\frac{k_f m_p}{2\pi} + B\right) \qquad (5.13)$$

This is precisely the result obtained by Carson,¹ who investigated this problem rigorously
for tone modulation [sinusoidal $m(t)$]. This formula goes under the name Carson's rule
in the literature. Observe that for a truly wideband case, where $\Delta f \gg B$, Eq. (5.13) can be
approximated as

$$B_{FM} \approx 2\Delta f \qquad \Delta f \gg B \qquad (5.14)$$

Because $\Delta\omega = k_f m_p$, this formula is precisely what the pioneers had used for FM bandwidth.
The only mistake was in thinking that this formula will hold for all cases, especially for the
narrowband case, where $\Delta f \ll B$.

We define a deviation ratio $\beta$ as

$$\beta = \frac{\Delta f}{B} \qquad (5.15)$$

Carson's rule can be expressed in terms of the deviation ratio as

$$B_{FM} = 2B(\beta + 1) \qquad (5.16)$$

The deviation ratio controls the amount of modulation and, consequently, plays a role
similar to the modulation index in AM. Indeed, for the special case of tone-modulated FM, the
deviation ratio $\beta$ is called the modulation index.
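The three bandwidth estimates discussed above, the pioneers' $2\Delta f$, Carson's rule of Eq. (5.13), and the staircase estimate of Eq. (5.12), can be compared side by side with a small sketch. The function names and the three trial pairs of $\Delta f$ and $B$ are assumptions chosen only to illustrate the narrowband, intermediate, and wideband regimes.

```python
def deviation_ratio(delta_f_hz, b_hz):
    """Deviation ratio beta = delta_f / B, Eq. (5.15)."""
    return delta_f_hz / b_hz

def carson_bandwidth(delta_f_hz, b_hz):
    """Carson's rule, Eq. (5.13): B_FM = 2 * (delta_f + B)."""
    return 2.0 * (delta_f_hz + b_hz)

def staircase_bandwidth(delta_f_hz, b_hz):
    """Staircase (upper) estimate, Eq. (5.12): 2 * (delta_f + 2B)."""
    return 2.0 * (delta_f_hz + 2.0 * b_hz)

def pioneer_bandwidth(delta_f_hz):
    """Wideband limit, Eq. (5.14): 2 * delta_f, which ignores spectral spreading."""
    return 2.0 * delta_f_hz

# Assumed trial cases in Hz: narrowband, intermediate, and wideband.
for delta_f, b in ((25.0, 15e3), (15e3, 15e3), (75e3, 15e3)):
    beta = deviation_ratio(delta_f, b)
    print(f"beta={beta:8.4f}  2*df={pioneer_bandwidth(delta_f):9.0f}  "
          f"Carson={carson_bandwidth(delta_f, b):9.0f}  "
          f"staircase={staircase_bandwidth(delta_f, b):9.0f}")
```

In the narrowband case the pioneers' estimate collapses toward zero while Carson's rule stays near $2B$; in the wideband case the three estimates differ by only a few multiples of $B$.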

Phase Modulation 

All the results derived for FM can be directly applied to PM. Thus, for PM, the instantaneous
frequency is given by

$$\omega_i = \omega_c + k_p \dot{m}(t)$$

Therefore, the peak frequency deviation $\Delta f$ is given by

$$\Delta f = k_p\,\frac{[\dot{m}(t)]_{\max} - [\dot{m}(t)]_{\min}}{2 \cdot 2\pi} \qquad (5.17a)$$

If we assume that

$$\dot{m}_p = [\dot{m}(t)]_{\max} = |[\dot{m}(t)]_{\min}| \qquad (5.17b)$$

then

$$\Delta f = k_p\,\frac{\dot{m}_p}{2\pi} \qquad (5.17c)$$

Therefore,*

$$B_{PM} = 2(\Delta f + B) \qquad (5.18a)$$

One very interesting aspect of FM is that $\Delta\omega = k_f m_p$ depends only on the peak value of $m(t)$.
It is independent of the spectrum of $m(t)$. On the other hand, in PM, $\Delta\omega = k_p \dot{m}_p$ depends on the
peak value of $\dot{m}(t)$. But $\dot{m}(t)$ depends strongly on the spectral composition of $m(t)$. The presence
of higher frequency components in $m(t)$ implies rapid time variations, resulting in a higher
value of $\dot{m}_p$. Conversely, predominance of lower frequency components will result in a lower
value of $\dot{m}_p$. Hence, whereas the FM signal bandwidth [Eq. (5.13)] is practically independent
of the spectral shape of $m(t)$, the PM signal bandwidth [Eq. (5.18)] is strongly affected by the
spectral shape of $m(t)$. For $m(t)$ with a spectrum concentrated at lower frequencies, $B_{PM}$ will
be smaller than when the spectrum of $m(t)$ is concentrated at higher frequencies.

Spectral Analysis of Tone Frequency Modulation 

For an FM carrier with a generic message signal $m(t)$, the spectral analysis requires the use of
staircase signal approximation. Tone modulation is a special case for which a precise spectral
analysis is possible: that is, when $m(t)$ is a sinusoid. We use this special case to verify the FM
bandwidth approximation. Let

$$m(t) = \alpha\cos\omega_m t$$

From Eq. (5.7), with the assumption that initially $a(-\infty) = 0$, we have

$$a(t) = \frac{\alpha}{\omega_m}\sin\omega_m t$$

Thus, from Eq. (5.8a), we have

$$\hat{\varphi}_{FM}(t) = A e^{j\left(\omega_c t + \frac{k_f \alpha}{\omega_m}\sin\omega_m t\right)}$$

Moreover

$$\Delta\omega = k_f m_p = \alpha k_f$$

and the bandwidth of $m(t)$ is $2\pi B = \omega_m$ rad/s. The deviation ratio (or in this case, the
modulation index) is

$$\beta = \frac{\Delta f}{B} = \frac{\Delta\omega}{2\pi B} = \frac{\alpha k_f}{\omega_m}$$

Hence,

$$\hat{\varphi}_{FM}(t) = A e^{j(\omega_c t + \beta\sin\omega_m t)} = A e^{j\omega_c t}\left(e^{j\beta\sin\omega_m t}\right) \qquad (5.19)$$

* Equation (5.17a) can be applied only if $m(t)$ is a continuous function of time. If $m(t)$ has jump discontinuities, its
derivative does not exist. In such a case, we should use the direct approach (discussed in Example 5.2) to find
$\varphi_{PM}(t)$ and then determine $\Delta\omega$ from $\varphi_{PM}(t)$.


Note that $e^{j\beta\sin\omega_m t}$ is a periodic signal with period $2\pi/\omega_m$ and can be expanded by the
exponential Fourier series, as usual,

$$e^{j\beta\sin\omega_m t} = \sum_{n=-\infty}^{\infty} D_n e^{jn\omega_m t}$$

where

$$D_n = \frac{\omega_m}{2\pi}\int_{-\pi/\omega_m}^{\pi/\omega_m} e^{j\beta\sin\omega_m t}\, e^{-jn\omega_m t}\, dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{j(\beta\sin x - nx)}\, dx$$

The integral on the right-hand side cannot be evaluated in a closed form but must be integrated
by expanding the integrand in infinite series. This integral has been extensively tabulated and
is denoted by $J_n(\beta)$, the Bessel function of the first kind and the $n$th order. These functions are
plotted in Fig. 5.7a as a function of $n$ for various values of $\beta$. Thus,

$$e^{j\beta\sin\omega_m t} = \sum_{n=-\infty}^{\infty} J_n(\beta)\, e^{jn\omega_m t} \qquad (5.20)$$


Substituting Eq. (5.20) into Eq. (5.19), we get

$$\hat{\varphi}_{FM}(t) = A\sum_{n=-\infty}^{\infty} J_n(\beta)\, e^{j(\omega_c t + n\omega_m t)}$$

and

$$\varphi_{FM}(t) = A\sum_{n=-\infty}^{\infty} J_n(\beta)\cos(\omega_c + n\omega_m)t$$

The tone-modulated FM signal has a carrier component and an infinite number of sidebands
of frequencies $\omega_c \pm \omega_m$, $\omega_c \pm 2\omega_m$, ..., $\omega_c \pm n\omega_m$, ..., as shown in Fig. 5.7b. This is in stark
contrast to the DSB-SC spectrum of only one sideband on either side of the carrier frequency.
The strength of the $n$th sideband at $\omega = \omega_c + n\omega_m$ is $J_n(\beta)$.* From the plots of $J_n(\beta)$ in Fig. 5.7a,
it can be seen that for a given $\beta$, $J_n(\beta)$ decreases with $n$, and there are only a finite number
of significant sideband spectral lines.

* Also $J_{-n}(\beta) = (-1)^n J_n(\beta)$. Hence, the magnitude of the LSB at $\omega = \omega_c - n\omega_m$ is the same as that of the USB at
$\omega = \omega_c + n\omega_m$.

Figure 5.7 (a) Variations of $J_n(\beta)$ as a function of $n$ for various values of $\beta$. (b) Tone-modulated FM wave spectrum.

It can be seen from Fig. 5.7a that $J_n(\beta)$ is negligible for
$n > \beta + 1$. Hence, the number of significant sideband impulses is $\beta + 1$. The bandwidth of
the FM carrier is given by

$$B_{FM} = 2(\beta + 1)f_m = 2(\Delta f + B)$$


which corroborates our previous result [Eq. (5.13)]. When $\beta \ll 1$ (NBFM), there is only one
significant sideband and the bandwidth $B_{FM} = 2f_m = 2B$. It is important to note that this tone
modulation case analysis is a verification, not a proof, of Carson's formula.
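The claim that sidebands beyond $n = \beta + 1$ are negligible can be checked numerically. The sketch below evaluates $J_n(\beta)$ directly from the integral that defines $D_n$ (using a midpoint rule, which is an implementation choice made here, not part of the text) and sums the sideband powers $J_n^2(\beta)$ that fall inside the Carson bandwidth. The function names and the trial values of $\beta$ are assumptions.

```python
import math

def bessel_j(n, beta, steps=2000):
    """J_n(beta) = (1/2pi) * integral over [-pi, pi] of cos(beta*sin(x) - n*x) dx.

    Only the real part of exp(j(beta*sin x - n x)) is integrated, because the
    imaginary part is an odd function of x and integrates to zero.
    """
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.cos(beta * math.sin(x) - n * x)
    return total * h / (2 * math.pi)

def power_fraction(beta, n_max):
    """Fraction of total power in the carrier plus sidebands |n| <= n_max.

    Uses J_{-n}^2 = J_n^2, and the fact that the J_n^2 sum to 1 over all n.
    """
    return bessel_j(0, beta) ** 2 + 2 * sum(bessel_j(n, beta) ** 2 for n in range(1, n_max + 1))

def carson_power_fraction(beta):
    """Power fraction inside 2(beta + 1) f_m, i.e. sidebands with n <= beta + 1."""
    return power_fraction(beta, math.floor(beta + 1))

for beta in (0.5, 1.0, 2.0, 5.0, 10.0):  # assumed trial values
    print(f"beta = {beta:5.1f}  power inside Carson bandwidth = {carson_power_fraction(beta):.4f}")
```

For these trial values the bandwidth $2(\beta + 1)f_m$ captures nearly all of the signal power, which is the sense in which the tone-modulation analysis verifies Carson's rule.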

In the literature, tone modulation in FM is often discussed in great detail. Since, however, 
angle modulation is a nonlinear modulation, the results derived for tone modulation may 
have little connection to practical situations. Indeed, these results are meaningless at best and 
misleading at worst when generalized to practical signals.* As authors and instructors, we feel 
that too much emphasis on tone modulation can be misleading. For this reason we have omitted 
further such discussion here. 

The method for finding the spectrum of a tone-modulated FM wave can be used for finding
the spectrum of an FM wave when $m(t)$ is a general periodic signal. In this case,

$$\hat{\varphi}_{FM}(t) = A e^{j\omega_c t}\left(e^{jk_f a(t)}\right)$$

Because $a(t)$ is a periodic signal, $e^{jk_f a(t)}$ is also a periodic signal, which can be expressed as an
exponential Fourier series in the preceding expression. After this, it is relatively straightforward
to write $\varphi_{FM}(t)$ in terms of the carrier and the sidebands.

* For instance, based on tone modulation analysis, it is often stated that FM is superior to PM by a factor of 3 in
terms of the output SNR. This is in fact untrue for most of the signals encountered in practice.

Example 5.3

(a) Estimate $B_{FM}$ and $B_{PM}$ for the modulating signal $m(t)$ in Fig. 5.4a for $k_f = 2\pi \times 10^5$
and $k_p = 5\pi$. Assume the essential bandwidth of the periodic $m(t)$ as the frequency of
its third harmonic.

(b) Repeat the problem if the amplitude of $m(t)$ is doubled [if $m(t)$ is multiplied by 2].


(a) The peak amplitude of $m(t)$ is unity. Hence, $m_p = 1$. We now determine the
essential bandwidth $B$ of $m(t)$. It is left as an exercise for the reader to show that the
Fourier series for this periodic signal is given by

$$m(t) = \sum_{n} C_n \cos n\omega_0 t \qquad \omega_0 = \frac{2\pi}{2 \times 10^{-4}} = 10^4\pi$$

where

$$C_n = \begin{cases} \dfrac{8}{\pi^2 n^2} & n \text{ odd} \\ 0 & n \text{ even} \end{cases}$$

It can be seen that the harmonic amplitudes decrease rapidly with $n$. The third harmonic
is only 11% of the fundamental, and the fifth harmonic is only 4% of the fundamental.
This means the third and fifth harmonic powers are 1.21 and 0.16%, respectively, of
the fundamental component power. Hence, we are justified in assuming the essential
bandwidth of $m(t)$ as the frequency of its third harmonic, that is,

$$B = 3 \times \frac{10^4}{2} = 15\ \text{kHz}$$

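For FM:

$$\Delta f = \frac{1}{2\pi}k_f m_p = \frac{1}{2\pi}(2\pi \times 10^5)(1) = 100\ \text{kHz}$$

and

$$B_{FM} = 2(\Delta f + B) = 230\ \text{kHz}$$

Alternatively, the deviation ratio $\beta$ is given by

$$\beta = \frac{\Delta f}{B} = \frac{100}{15}$$

and

$$B_{FM} = 2B(\beta + 1) = 30\left(\frac{100}{15} + 1\right) = 230\ \text{kHz}$$

For PM: The peak amplitude of $\dot{m}(t)$ is 20,000 and

$$\Delta f = \frac{1}{2\pi}k_p \dot{m}_p = 50\ \text{kHz}$$

Hence,

$$B_{PM} = 2(\Delta f + B) = 130\ \text{kHz}$$

Alternatively, the deviation ratio $\beta$ is given by

$$\beta = \frac{\Delta f}{B} = \frac{50}{15}$$

and

$$B_{PM} = 2B(\beta + 1) = 30\left(\frac{50}{15} + 1\right) = 130\ \text{kHz}$$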

(b) Doubling $m(t)$ doubles its peak value. Hence, $m_p = 2$. But its bandwidth is unchanged
so that $B = 15$ kHz.

For FM:

$$\Delta f = \frac{1}{2\pi}k_f m_p = \frac{1}{2\pi}(2\pi \times 10^5)(2) = 200\ \text{kHz}$$

and

$$B_{FM} = 2(\Delta f + B) = 430\ \text{kHz}$$

Alternatively, the deviation ratio $\beta$ is given by

$$\beta = \frac{\Delta f}{B} = \frac{200}{15}$$

and

$$B_{FM} = 2B(\beta + 1) = 30\left(\frac{200}{15} + 1\right) = 430\ \text{kHz}$$

For PM: Doubling $m(t)$ doubles its derivative so that now $\dot{m}_p = 40{,}000$, and

$$\Delta f = \frac{1}{2\pi}k_p \dot{m}_p = 100\ \text{kHz}$$

and

$$B_{PM} = 2(\Delta f + B) = 230\ \text{kHz}$$

Alternatively, the deviation ratio $\beta$ is given by

$$\beta = \frac{\Delta f}{B} = \frac{100}{15}$$

and

$$B_{PM} = 2B(\beta + 1) = 30\left(\frac{100}{15} + 1\right) = 230\ \text{kHz}$$

Observe that doubling the signal amplitude [doubling $m(t)$] doubles the frequency
deviation $\Delta f$, and roughly doubles the bandwidth, of both FM and PM waveforms.
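The arithmetic of Example 5.3 is easy to reproduce. The sketch below is one way to do so; the function names and the `scale` parameter (1 for part a, 2 for part b) are illustrative assumptions, while the constants $k_f$, $k_p$, $B$, $m_p$, and $\dot{m}_p$ are the values stated in the example.

```python
import math

def fm_peak_deviation(kf, m_peak):
    """Delta f = kf * m_p / (2*pi) in Hz."""
    return kf * m_peak / (2 * math.pi)

def pm_peak_deviation(kp, mdot_peak):
    """Delta f = kp * mdot_p / (2*pi) in Hz, Eq. (5.17c)."""
    return kp * mdot_peak / (2 * math.pi)

def carson_bandwidth(delta_f, b):
    """Carson's rule: 2 * (delta_f + B)."""
    return 2 * (delta_f + b)

def example_5_3(scale=1.0):
    """Return (B_FM, B_PM) in Hz when m(t) is multiplied by `scale`."""
    kf = 2 * math.pi * 1e5        # given in the example
    kp = 5 * math.pi              # given in the example
    b = 15e3                      # essential bandwidth: third harmonic of 5 kHz
    m_peak = 1.0 * scale          # peak of m(t)
    mdot_peak = 20_000.0 * scale  # peak of the derivative of m(t)
    b_fm = carson_bandwidth(fm_peak_deviation(kf, m_peak), b)
    b_pm = carson_bandwidth(pm_peak_deviation(kp, mdot_peak), b)
    return b_fm, b_pm

for label, scale in (("(a)", 1.0), ("(b)", 2.0)):
    b_fm, b_pm = example_5_3(scale)
    print(f"{label} B_FM = {b_fm / 1e3:.0f} kHz, B_PM = {b_pm / 1e3:.0f} kHz")
```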


Example 5.4 Repeat Example 5.3 if $m(t)$ is time-expanded by a factor of 2: that is, if the period of $m(t)$ is
$4 \times 10^{-4}$.

Recall that time expansion of a signal by a factor of 2 reduces the signal spectral width
(bandwidth) by a factor of 2. We can verify this by observing that the fundamental
frequency is now 2.5 kHz, and its third harmonic is 7.5 kHz. Hence, $B = 7.5$ kHz, which is
half the previous bandwidth. Moreover, time expansion does not affect the peak amplitude
and thus $m_p = 1$. However, $\dot{m}_p$ is halved, that is, $\dot{m}_p = 10{,}000$.

For FM:

$$\Delta f = \frac{1}{2\pi}k_f m_p = 100\ \text{kHz}$$

$$B_{FM} = 2(\Delta f + B) = 2(100 + 7.5) = 215\ \text{kHz}$$

For PM:

$$\Delta f = \frac{1}{2\pi}k_p \dot{m}_p = 25\ \text{kHz}$$

$$B_{PM} = 2(\Delta f + B) = 65\ \text{kHz}$$

Note that time expansion of $m(t)$ has very little effect on the FM bandwidth, but it halves the
PM bandwidth. This verifies our observation that the PM spectrum is strongly dependent
on the spectrum of $m(t)$.


Example 5.5 An angle-modulated signal with carrier frequency $\omega_c = 2\pi \times 10^5$ is described by the equation

$$\varphi_{EM}(t) = 10\cos(\omega_c t + 5\sin 3000t + 10\sin 2000\pi t)$$

(a) Find the power of the modulated signal.

(b) Find the frequency deviation $\Delta f$.

(c) Find the deviation ratio $\beta$.

(d) Find the phase deviation $\Delta\phi$.

(e) Estimate the bandwidth of $\varphi_{EM}(t)$.

The signal bandwidth is the highest frequency in $m(t)$ (or its derivative). In this case
$B = 2000\pi/2\pi = 1000$ Hz.

(a) The carrier amplitude is 10, and the power is

$$P = \frac{10^2}{2} = 50$$

(b) To find the frequency deviation $\Delta f$, we find the instantaneous frequency $\omega_i$, given by

$$\omega_i = \frac{d}{dt}\theta(t) = \omega_c + 15{,}000\cos 3000t + 20{,}000\pi\cos 2000\pi t$$

The carrier deviation is $15{,}000\cos 3000t + 20{,}000\pi\cos 2000\pi t$. The two sinusoids
will add in phase at some point, and the maximum value of this expression is $15{,}000 + 20{,}000\pi$.
This is the maximum carrier deviation $\Delta\omega$. Hence,

$$\Delta f = \frac{\Delta\omega}{2\pi} = 12{,}387.32\ \text{Hz}$$

(c)

$$\beta = \frac{\Delta f}{B} = \frac{12{,}387.32}{1000} = 12.387$$

(d) The angle $\theta(t) = \omega_c t + (5\sin 3000t + 10\sin 2000\pi t)$. The phase deviation is the
maximum value of the angle inside the parentheses, and is given by $\Delta\phi = 15$ rad.

(e) $B_{EM} = 2(\Delta f + B) = 26{,}774.65$ Hz

Observe the generality of this method of estimating the bandwidth of an angle- 
modulated waveform. We need not know whether it is FM, PM, or some other kind 
of angle modulation. It is applicable to any angle-modulated signal. 
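The steps of Example 5.5 generalize to any carrier whose angle is the carrier phase plus a sum of sinusoids. The following sketch assumes exactly that form; the function name, the dictionary keys, and the representation of each term as a pair (peak phase in radians, angular frequency in rad/s) are choices made here for illustration.

```python
import math

def angle_modulation_summary(carrier_amplitude, phase_terms):
    """Summarize A*cos(wc*t + sum_i a_i*sin(w_i*t)).

    phase_terms is a list of (a_i, w_i) pairs: peak phase in rad and angular
    frequency in rad/s. Worst-case deviations assume the terms add in phase.
    """
    power = carrier_amplitude ** 2 / 2
    delta_omega = sum(a * w for a, w in phase_terms)  # peak of d/dt of the added angle
    delta_f = delta_omega / (2 * math.pi)
    bandwidth_b = max(w for _, w in phase_terms) / (2 * math.pi)  # highest baseband frequency
    return {
        "power": power,
        "delta_f_hz": delta_f,
        "beta": delta_f / bandwidth_b,
        "delta_phi_rad": sum(a for a, _ in phase_terms),
        "bandwidth_hz": 2 * (delta_f + bandwidth_b),  # Carson's rule
    }

# The signal of Example 5.5: 10*cos(wc*t + 5*sin(3000 t) + 10*sin(2000*pi*t)).
result = angle_modulation_summary(10.0, [(5.0, 3000.0), (10.0, 2000.0 * math.pi)])
for key, value in result.items():
    print(f"{key:>14s} = {value:.2f}")
```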


A Historical Note: Edwin H. Armstrong (1890-1954) 

Today, nobody doubts that FM has a key place in broadcasting and communication. As recently
as the 1960s, however, FM broadcasting seemed doomed because it was so uneconomical
in bandwidth usage.

The history of FM is full of strange ironies. The impetus behind the development of FM 
was the desire to reduce signal transmission bandwidth. Superficial reasoning showed that it 
was feasible to reduce the transmission bandwidth by using FM. But the experimental results 
showed otherwise. The transmission bandwidth of FM was actually larger than that of AM. 
Careful mathematical analysis by Carson showed that FM indeed required a larger bandwidth
than AM. Unfortunately, Carson did not recognize the compensating advantage of FM in
its ability to suppress noise. Without much basis, he concluded that FM introduces inherent
distortion and has no compensating advantages whatsoever.¹ In a later paper, he continues:
"In fact, as more and more schemes are analyzed and tested, and as the essential nature of
the problem is more clearly perceivable, we are unavoidably forced to the conclusion that
static (noise), like the poor, will always be with us."² The opinion of one of the most able
mathematicians of the day in the communication field, thus, set back the development of FM
by more than a decade. The noise-suppressing advantage of FM was later proved by Major
Edwin H. Armstrong,³ a brilliant engineer whose contributions to the field of radio systems
are comparable to those of Hertz and Marconi. It was largely the work of Armstrong that was
responsible for rekindling the interest in FM.

[Photograph: Edwin H. Armstrong. (Reproduced with permission from Armstrong Family Archives.)]

Although Armstrong did not invent the concept, he has been considered the father of
modern FM. Born on December 18, 1890, in New York City, Edwin H. Armstrong is widely
regarded as one of the foremost contributors to radio electronics of the twentieth century.
Armstrong was credited with the invention of the regenerative circuit (U.S. Patent 1,113,149
issued in 1912, while he was a junior at Columbia University), the superheterodyne circuit
(U.S. Patent 1,342,885 issued in 1918, while serving in the U.S. Army stationed in Paris, during
World War I), the super-regenerative circuit (U.S. Patent 1,424,065, issued in 1922), and the
complete FM radio broadcasting system (U.S. Patent 1,941,066, 1933). All are breakthrough
contributions to the radio field. Fortune magazine in 1939 declared: "Wideband frequency
modulation is the fourth, and perhaps the greatest, in a line of Armstrong inventions that have made
most of modern broadcasting what it is. Major Armstrong is the acknowledged inventor of the
regenerative 'feedback' circuit, which brought radio art out of the crystal-detector headphone
stage and made the amplification of broadcasting possible; the superheterodyne circuit, which
is the basis of practically all modern radio; and the super-regenerative circuit now in wide use
in ... shortwave systems."⁴

Armstrong was the last of the breed of the lone attic inventors. After receiving his FM
patents in 1933, he gave his now famous paper (which later appeared in print in the
Proceedings of the IRE³), accompanied by the first public demonstration of FM broadcasting
on November 5, 1935, at the New York section meeting of the Institute of Radio Engineers
(IRE, a predecessor of the IEEE). His success in dramatically reducing static noise using FM
was not fully embraced by the broadcast establishment, which perceived FM as a threat to its
vast commercial investment in AM radio. To establish FM broadcasting, Armstrong fought a
long and costly battle with the radio broadcast establishment, which, abetted by the Federal
Communications Commission (FCC), fought tooth and nail to resist FM. Still, by December
1941, 67 commercial FM stations had been authorized with as many as half a million receivers
in use, and 43 applications were pending. In fact, the Radio Technical Planning Board (RTPB)
made its final recommendation during the September 1944 FCC hearing that FM be given 75
channels in the band from 41 to 56 MHz.

Despite the recommendation of the RTPB, which was supposed to be the best advice
available from the radio engineering community, strong lobbying for the FCC to shift the FM
band persisted, mainly by those who propagated the concern that strong radio interferences in
the 40 MHz band might be possible as a result of ionospheric reflection. Then in June 1945, the
FCC, on the basis of erroneous testimony of a technical expert, abruptly shifted the allocated
bandwidth of FM from the 42- to 50-MHz range to the 88- to 108-MHz range. This dealt a crippling
blow to FM by making obsolete more than half a million receivers and equipment (transmitters,
antennas, etc.) that had been built and sold by the FM industry to 50 FM stations since 1941
for the 42 to 50 MHz band. Armstrong fought the decision, and later succeeded in getting the
technical expert to admit his error. In spite of all this, the FCC allocations remained unchanged.
Armstrong spent the sizable fortune he had made from his inventions in legal struggles. The
broadcast giants, which had so strongly resisted FM, turned around and used his inventions
without paying him royalties. Armstrong spent much of his time in court in some of the longest,
most notable, and acrimonious patent suits of the era.⁵ In the end, with his funds depleted, his
energy drained, and his family life shattered, a despondent Armstrong committed suicide: (in
1954) he walked out of a window of his thirteenth floor apartment in New York's River House.

Armstrong's widow continued the legal battles and won. By the 1960s, FM was clearly
established as the superior radio system,⁶ and Edwin H. Armstrong was fully recognized as the
inventor of frequency modulation. In 1955 the ITU added him to its roster of great inventors.
In 1980 Edwin H. Armstrong was inducted into the U.S. National Inventors Hall of Fame, and
his picture was put on a U.S. postage stamp in 1983.⁷


5.3 GENERATING FM WAVES 

Basically, there are two ways of generating FM waves: indirect and direct. We first describe 
the narrowband FM generator that is utilized in the indirect FM generation of wideband angle 
modulation signals. 

NBFM Generation 

For NBFM and NBPM signals, we have shown earlier that because $|k_f a(t)| \ll 1$ and
$|k_p m(t)| \ll 1$, respectively, the modulated signals can be approximated by

$$\varphi_{NBFM}(t) \simeq A[\cos\omega_c t - k_f a(t)\sin\omega_c t] \qquad (5.21a)$$

$$\varphi_{NBPM}(t) \simeq A[\cos\omega_c t - k_p m(t)\sin\omega_c t] \qquad (5.21b)$$

Both approximations are linear and are similar to the expression of the AM wave. In fact,
Eqs. (5.21) suggest a possible method of generating narrowband FM and PM signals by using
DSB-SC modulators. The block diagram representation of such systems appears in Fig. 5.8.

It is important to point out that the NBFM generated by Fig. 5.8b has some distortion
because of the approximation in Eq. (5.10). The output of this NBFM modulator also has some
amplitude variations. A nonlinear device designed to limit the amplitude of a bandpass signal
can remove most of this distortion.

Figure 5.8 (a) Narrowband PM generator. (b) Narrowband FM signal generator.

Figure 5.9 (a) Hard limiter and bandpass filter used to remove amplitude variations in FM wave. (b) Hard limiter input-output characteristic. (c) Hard limiter input and the corresponding output. (d) Hard limiter output as a function of $\theta$.

Bandpass Limiter

The amplitude variations of an angle-modulated carrier can be eliminated by what is known as
a bandpass limiter, which consists of a hard limiter followed by a bandpass filter (Fig. 5.9a).
The input-output characteristic of a hard limiter is shown in Fig. 5.9b. Observe that the hard
limiter output to a sinusoid will be a square wave of unit amplitude regardless of the incoming
sinusoidal amplitude. Moreover, the zero crossings of the incoming sinusoid are preserved
in the output because when the input is zero, the output is also zero (Fig. 5.9b). Thus an
angle-modulated sinusoidal input $v_i(t) = A(t)\cos\theta(t)$ results in a constant-amplitude,
angle-modulated square wave $v_o(t)$, as shown in Fig. 5.9c. As we have seen, such a nonlinear operation
preserves the angle modulation information. When $v_o(t)$ is passed through a bandpass filter
centered at $\omega_c$, the output is an angle-modulated wave of constant amplitude. To show this,
consider the incoming angle-modulated wave

$$v_i(t) = A(t)\cos\theta(t)$$

where

$$\theta(t) = \omega_c t + k_f\int_{-\infty}^{t} m(\alpha)\, d\alpha$$

The output $v_o(t)$ of the hard limiter is $+1$ or $-1$, depending on whether $v_i(t) = A(t)\cos\theta(t)$
is positive or negative (Fig. 5.9c). Because $A(t) \geq 0$, $v_o(t)$ can be expressed as a function of $\theta$:

$$v_o(\theta) = \begin{cases} 1 & \cos\theta > 0 \\ -1 & \cos\theta < 0 \end{cases}$$

Hence, $v_o$ as a function of $\theta$ is a periodic square wave function with period $2\pi$ (Fig. 5.9d),
which can be expanded by a Fourier series (Chapter 2)

$$v_o(\theta) = \frac{4}{\pi}\left(\cos\theta - \frac{1}{3}\cos 3\theta + \frac{1}{5}\cos 5\theta + \cdots\right)$$
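As a quick numerical check of the series above, the cosine coefficients of the hard limiter output can be computed by direct integration over one period. This is only a sketch: the function names and the midpoint-rule integration with 20,000 steps are assumptions made here.

```python
import math

def hard_limiter(theta):
    """Hard limiter output as a function of theta: the sign of cos(theta)."""
    c = math.cos(theta)
    if c > 0:
        return 1.0
    if c < 0:
        return -1.0
    return 0.0

def cosine_coefficient(n, steps=20000):
    """(1/pi) * integral over [0, 2*pi] of v_o(theta)*cos(n*theta) d(theta), midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += hard_limiter(theta) * math.cos(n * theta)
    return total * h / math.pi

for n in range(1, 8):
    print(f"n = {n}: coefficient = {cosine_coefficient(n):+.5f}")
```

The odd-harmonic coefficients alternate in sign and fall off as $4/(n\pi)$, while the even-harmonic coefficients vanish, in agreement with the series.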

At any instant $t$, $\theta = \omega_c t + k_f\int m(\alpha)\, d\alpha$. Hence, the output $v_o$ as a function of time is
given by

$$v_o[\theta(t)] = v_o\left(\omega_c t + k_f\int m(\alpha)\, d\alpha\right) = \frac{4}{\pi}\left\{\cos\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right] - \frac{1}{3}\cos 3\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right] + \frac{1}{5}\cos 5\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right] + \cdots\right\}$$

The output, therefore, has the original FM wave plus frequency-multiplied FM waves
with multiplication factors of 3, 5, 7, .... We can pass the output of the hard limiter through
a bandpass filter with a center frequency $\omega_c$ and a bandwidth $B_{FM}$, as shown in Fig. 5.9a. The
filter output $e_o(t)$ is the desired angle-modulated carrier with a constant amplitude,

$$e_o(t) = \frac{4}{\pi}\cos\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right]$$

Although we derived these results for FM, this applies to PM (angle modulation in general)
as well. The bandpass filter not only maintains the constant amplitude of the angle-modulated
carrier but also partially suppresses the channel noise when the noise is small.⁸

Indirect Method of Armstrong

In Armstrong's indirect method, NBFM is generated as shown in Fig. 5.8b [or Eq. (5.10)]. The
NBFM is then converted to WBFM by using additional frequency multipliers.

A frequency multiplier can be realized by a nonlinear device followed by a bandpass filter.
First consider a nonlinear device whose output signal $y(t)$ to an input $x(t)$ is given by

$$y(t) = a_2 x^2(t)$$

If an FM signal passes through this device, then the output signal will be

$$y(t) = a_2\cos^2\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right] = 0.5a_2 + 0.5a_2\cos\left[2\omega_c t + 2k_f\int m(\alpha)\, d\alpha\right] \qquad (5.22)$$


Thus, a bandpass filter centered at $2\omega_c$ would recover an FM signal with twice the original
instantaneous frequency. To generalize, a nonlinear device may have the characteristic of

$$y(t) = a_0 + a_1 x(t) + a_2 x^2(t) + \cdots + a_n x^n(t) \qquad (5.23)$$

If $x(t) = A\cos\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right]$, then by using trigonometric identities, we can readily
show that $y(t)$ is of the form

$$y(t) = c_0 + c_1\cos\left[\omega_c t + k_f\int m(\alpha)\, d\alpha\right] + c_2\cos\left[2\omega_c t + 2k_f\int m(\alpha)\, d\alpha\right] + \cdots + c_n\cos\left[n\omega_c t + nk_f\int m(\alpha)\, d\alpha\right] \qquad (5.24)$$


Hence, the output will have spectra at $\omega_c, 2\omega_c, \ldots, n\omega_c$, with frequency deviations
$\Delta f, 2\Delta f, \ldots, n\Delta f$, respectively. Each one of these components is an FM signal separated
from the others. Thus, a bandpass filter centered at $n\omega_c$ can recover an FM signal whose
instantaneous frequency has been multiplied by a factor of $n$. These devices, consisting of
nonlinearity and bandpass filters, are known as frequency multipliers. In fact, a frequency
multiplier can increase both the carrier frequency and the frequency deviation by an integer $n$.
Thus, if we want a twelvefold increase in the frequency deviation, we can use a twelfth-order
nonlinear device or two second-order and one third-order devices in cascade. The output has
a bandpass filter centered at $12\omega_c$, so that it selects only the appropriate term, whose carrier
frequency as well as the frequency deviation $\Delta f$ are 12 times the original values.

This forms the basis of the Armstrong indirect frequency modulator. First, generate an
NBFM approximately. Then multiply the NBFM frequency and limit its amplitude variation.
Generally, we need to increase $\Delta f$ by a very large factor $n$. This increases the carrier
frequency also by $n$. Such a large increase in the carrier frequency may not be needed. In this case
we can apply frequency mixing (see Example 4.2, Fig. 4.7) to shift down the carrier frequency
to the desired value.

A simplified diagram of a commercial FM transmitter using Armstrong's method is shown
in Fig. 5.10. The final output is required to have a carrier frequency of 91.2 MHz and $\Delta f = 75$
kHz. We begin with NBFM with a carrier frequency $f_{c1} = 200$ kHz generated by a crystal
oscillator. This frequency is chosen because it is easy to construct stable crystal oscillators as
well as balanced modulators at this frequency. To maintain $\beta \ll 1$, as required in NBFM, the
deviation $\Delta f$ is chosen to be 25 Hz. For tone modulation, $\beta = \Delta f/f_m$. The baseband spectrum
(required for high-fidelity purposes) ranges from 50 Hz to 15 kHz. The choice of $\Delta f = 25$ Hz
is reasonable because it gives $\beta = 0.5$ for the worst possible case ($f_m = 50$ Hz).

Figure 5.10 Block diagram of the Armstrong indirect FM transmitter.

To achieve $\Delta f = 75$ kHz, we need a multiplication of 75,000/25 = 3000. This can be
done by two multiplier stages, of 64 and 48, as shown in Fig. 5.10, giving a total multiplication
of $64 \times 48 = 3072$, and $\Delta f = 76.8$ kHz.* The multiplication is effected by using frequency
doublers and triplers in cascade, as needed. Thus, a multiplication of 64 can be obtained by six
doublers in cascade, and a multiplication of 48 can be obtained by four doublers and a tripler
in cascade. Multiplication of $f_c = 200$ kHz by 3072, however, would yield a final carrier of
about 600 MHz. This problem is solved by using a frequency translation, or conversion, after
the first multiplier (Fig. 5.10). The first multiplication by 64 results in the carrier frequency
$f_{c2} = 200\ \text{kHz} \times 64 = 12.8$ MHz, and the carrier deviation $\Delta f_2 = 25 \times 64 = 1.6$ kHz. We
now use a frequency converter (or mixer) with carrier frequency 10.9 MHz to shift the entire
spectrum. This results in a new carrier frequency $f_{c3} = 12.8 - 10.9 = 1.9$ MHz. The frequency
converter shifts the entire spectrum without altering $\Delta f$. Hence, $\Delta f_3 = 1.6$ kHz. Further
multiplication, by 48, yields $f_{c4} = 1.9 \times 48 = 91.2$ MHz and $\Delta f_4 = 1.6 \times 48 = 76.8$ kHz.
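
The arithmetic of this transmitter chain can be retraced with a short script. This is only a sketch: the function names are illustrative and not part of the text, and the numbers are the ones quoted for Fig. 5.10.

```python
# Sketch of the Armstrong chain of Fig. 5.10; all names are illustrative.

def doublers_and_triplers(factor):
    """Split a multiplication factor into (doubler count, tripler count)."""
    doublers = triplers = 0
    while factor % 2 == 0:
        factor //= 2
        doublers += 1
    while factor % 3 == 0:
        factor //= 3
        triplers += 1
    if factor != 1:
        raise ValueError("factor is not a product of 2s and 3s")
    return doublers, triplers


def armstrong_chain(fc1_hz, df1_hz, m1, f_lo_hz, m2):
    """Return (carrier, deviation) after multiplier M1, down-mixing, multiplier M2."""
    fc2, df2 = fc1_hz * m1, df1_hz * m1   # a multiplier scales carrier and deviation
    fc3, df3 = fc2 - f_lo_hz, df2         # the mixer shifts the carrier only
    fc4, df4 = fc3 * m2, df3 * m2         # second multiplier scales both again
    return fc4, df4


fc4, df4 = armstrong_chain(200e3, 25.0, 64, 10.9e6, 48)
print(fc4 / 1e6, "MHz,", df4 / 1e3, "kHz")
print(doublers_and_triplers(64), doublers_and_triplers(48))
```

The mixer step is what decouples the two requirements: the deviation sees the full factor of 3072, while the carrier sees only the second factor of 48 after being shifted down.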

This scheme has an advantage of frequency stability, but it suffers from inherent noise
caused by excessive multiplication and distortion at lower modulating frequencies, where
$\Delta f/f_m$ is not small enough.


Example 5.6 Discuss the nature of distortion inherent in the Armstrong indirect FM generator. 


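Two kinds of distortion arise in this scheme: amplitude distortion and frequency distortion.
The NBFM wave is given by [Eq. (5.10)]

$$\varphi_{FM}(t) = A\left[\cos\omega_c t - k_f a(t)\sin\omega_c t\right] = AE(t)\cos\left[\omega_c t + \theta(t)\right]$$

where

$$E(t) = \sqrt{1 + k_f^2 a^2(t)} \quad\text{and}\quad \theta(t) = \tan^{-1}\left[k_f a(t)\right]$$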

* If we wish $\Delta f$ to be exactly 75 kHz instead of 76.8 kHz, we must reduce the narrowband $\Delta f$ from 25 Hz to
25(75/76.8) = 24.41 Hz.

Amplitude distortion occurs because the amplitude $AE(t)$ of the modulated waveform is
not constant. This is not a serious problem because amplitude variations can be eliminated
by a bandpass limiter, as discussed earlier in the section (see also Fig. 5.9). Ideally,
$\theta(t)$ should be $k_f a(t)$. Instead, the phase $\theta(t)$ in the preceding equation is

$$\theta(t) = \tan^{-1}\left[k_f a(t)\right]$$

and the instantaneous frequency $\omega_i(t)$ is

$$\omega_i(t) = \dot\theta(t) = \frac{k_f\,\dot a(t)}{1 + k_f^2 a^2(t)} = \frac{k_f m(t)}{1 + k_f^2 a^2(t)} = k_f m(t)\left[1 - k_f^2 a^2(t) + k_f^4 a^4(t) - \cdots\right]$$


Ideally, the instantaneous frequency should be $k_f m(t)$. The remaining terms in this equation
are the distortion.

Let us investigate the effect of this distortion in tone modulation, where $m(t) = \alpha\cos\omega_m t$,
$a(t) = \alpha\sin\omega_m t/\omega_m$, and the modulation index $\beta = \alpha k_f/\omega_m$:

$$\omega_i(t) = \beta\omega_m\cos\omega_m t\left(1 - \beta^2\sin^2\omega_m t + \beta^4\sin^4\omega_m t - \cdots\right)$$

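It is evident from this equation that the scheme has odd-harmonic distortion, the most
important term being the third harmonic. Ignoring the remaining terms, this equation
becomes

$$\omega_i(t) \simeq \beta\omega_m\cos\omega_m t\left(1 - \beta^2\sin^2\omega_m t\right) = \underbrace{\beta\omega_m\left(1 - \frac{\beta^2}{4}\right)\cos\omega_m t}_{\text{desired}} + \underbrace{\frac{\beta^3\omega_m}{4}\cos 3\omega_m t}_{\text{distortion}}$$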

The ratio of the third-harmonic distortion to the desired signal can be found for the
generator in Fig. 5.10. For the NBFM stage,

$$\beta f_m = \Delta f_1 = 25\ \text{Hz}$$

Hence, the worst possible case occurs at the lowest modulation frequency. For example, if
the tone frequency is only 50 Hz, then $\beta = 0.5$. In this case the third-harmonic distortion
is 1/15, or 6.67%.
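
The 1/15 figure follows from the two coefficients in the truncated expression above. A minimal sketch, assuming only the truncated series and the identity $\sin^2 x\cos x = (\cos x - \cos 3x)/4$ (the function names are illustrative):

```python
import math


def third_harmonic_ratio(beta):
    """Ratio of third-harmonic amplitude to fundamental amplitude.

    Based on w_i ~ beta*w_m*cos(x)*(1 - beta^2*sin^2(x)) together with
    sin^2(x)*cos(x) = (cos(x) - cos(3x))/4.
    """
    desired = 1.0 - beta**2 / 4.0      # coefficient of cos(w_m t), per unit beta*w_m
    distortion = beta**2 / 4.0         # coefficient of cos(3 w_m t)
    return distortion / desired


def identity_holds(x):
    """Numerically confirm the trigonometric identity used above."""
    lhs = math.sin(x) ** 2 * math.cos(x)
    rhs = (math.cos(x) - math.cos(3 * x)) / 4.0
    return math.isclose(lhs, rhs, abs_tol=1e-12)


beta = 25.0 / 50.0                     # delta_f / f_m for the 50 Hz worst case
print(third_harmonic_ratio(beta))      # about 0.0667, that is, 1/15
```

Because the ratio is $\beta^2/(4 - \beta^2)$, the distortion falls off quickly as the tone frequency rises: at 500 Hz ($\beta = 0.05$) it is already below 0.1%.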


Direct Generation 

In a voltage-controlled oscillator (VCO), the frequency is controlled by an external voltage.
The oscillation frequency varies linearly with the control voltage. We can generate an FM
wave by using the modulating signal $m(t)$ as a control signal. This gives

$$\omega_i(t) = \omega_c + k_f m(t)$$

One can construct a VCO using an operational amplifier and a hysteretic comparator⁹ (such
as a Schmitt trigger circuit). Another way of accomplishing the same goal is to vary one of the
reactive parameters ($C$ or $L$) of the resonant circuit of an oscillator. A reverse-biased
semiconductor diode acts as a capacitor whose capacitance varies with the bias voltage. The capacitance
of these diodes, known under several trade names (e.g., Varicap, Varactor, Voltacap), can be
approximated as a linear function of the bias voltage $m(t)$ over a limited range. In Hartley or
Colpitts oscillators, for instance, the frequency of oscillation is given by

$$\omega_0 = \frac{1}{\sqrt{LC}}$$

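If the capacitance $C$ is varied by the modulating signal $m(t)$, that is, if

$$C = C_0 - k\,m(t)$$

then

$$\omega_0 = \frac{1}{\sqrt{LC_0\left[1 - \dfrac{k\,m(t)}{C_0}\right]}} = \frac{1}{\sqrt{LC_0}}\left[1 - \frac{k\,m(t)}{C_0}\right]^{-1/2} \approx \frac{1}{\sqrt{LC_0}}\left[1 + \frac{k\,m(t)}{2C_0}\right], \qquad \frac{k\,m(t)}{C_0} \ll 1$$

Here we have applied the Taylor series approximation

$$(1 + x)^n \approx 1 + nx, \qquad |x| \ll 1$$

with $n = -1/2$ and $x = -k\,m(t)/C_0$. Thus,

$$\omega_0 = \omega_c\left[1 + \frac{k\,m(t)}{2C_0}\right] \quad\text{where}\quad \omega_c = \frac{1}{\sqrt{LC_0}}$$

$$\phantom{\omega_0} = \omega_c + k_f m(t) \quad\text{with}\quad k_f = \frac{k\,\omega_c}{2C_0}$$

Because $C = C_0 - k\,m(t)$, the maximum capacitance deviation is

$$\Delta C = k\,m_p = \frac{2k_f C_0 m_p}{\omega_c}$$

Hence,

$$\frac{\Delta C}{C_0} = \frac{2k_f m_p}{\omega_c} = \frac{2\Delta f}{f_c}$$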

In practice, $\Delta f/f_c$ is usually small, and, hence, $\Delta C$ is a small fraction of $C_0$, which helps limit
the harmonic distortion that arises because of the approximation used in this derivation.

We may also generate direct FM by using a saturable core reactor, where the inductance
of a coil is varied by a current through a second coil (also wound around the same core). This
results in a variable inductor whose inductance is proportional to the current in the second coil.
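
Returning to the variable-capacitance derivation, the quality of the first-order approximation can be checked numerically. The sketch below uses hypothetical component values ($L$, $C_0$) and a hypothetical peak deviation of 75 kHz; none of these come from the text.

```python
import math


def exact_frequency(L, C0, k_m):
    """Exact resonant frequency 1/sqrt(L*C) with C = C0 - k*m(t), in rad/s."""
    return 1.0 / math.sqrt(L * (C0 - k_m))


def linearized_frequency(L, C0, k_m):
    """First-order approximation w_c * (1 + k*m(t) / (2*C0)), in rad/s."""
    w_c = 1.0 / math.sqrt(L * C0)
    return w_c * (1.0 + k_m / (2.0 * C0))


# Hypothetical tank circuit and deviation, for illustration only.
L, C0 = 1e-6, 25e-12
w_c = 1.0 / math.sqrt(L * C0)
f_c = w_c / (2.0 * math.pi)
delta_f = 75e3

delta_C = C0 * 2.0 * delta_f / f_c     # from delta_C / C0 = 2 * delta_f / f_c
exact = exact_frequency(L, C0, delta_C)
approx = linearized_frequency(L, C0, delta_C)
rel_error = abs(exact - approx) / exact
print(delta_C / C0, rel_error)
```

With these values $\Delta C/C_0$ is well under 1%, and the relative error of the linear model is of the order of $(\Delta C/C_0)^2$, which is the residual that shows up as harmonic distortion.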

Direct FM generation generally produces sufficient frequency deviation and requires little 
frequency multiplication. But this method has poor frequency stability. In practice, feedback 
is used to stabilize the frequency. The output frequency is compared with a constant frequency 
generated by a stable crystal oscillator. An error signal (error in frequency) is detected and fed 
back to the oscillator to correct the error. 


Features of Angle Modulation 

FM (like angle modulation in general) has a number of unique features that recommend it
for various radio systems. The transmission bandwidth of AM systems cannot be changed.
Because of this, AM systems do not have the feature of exchanging signal power for
transmission bandwidth. Pulse-coded modulation (PCM) systems (Chapter 6) have such a feature,
and so do angle-modulated systems. In angle modulation, the transmission bandwidth can be
adjusted by adjusting $\Delta f$. For angle-modulated systems, the SNR is roughly proportional to
the square of the transmission bandwidth $B_T$. In PCM, the SNR varies exponentially with $B_T$
and is, therefore, superior to angle modulation.


Example 5.7 Design an Armstrong indirect FM modulator to generate an FM signal with carrier frequency
97.3 MHz and $\Delta f = 10.24$ kHz. An NBFM generator of $f_{c1} = 20$ kHz and $\Delta f_1 = 5$ Hz is
available. Only frequency doublers can be used as multipliers. Additionally, a local oscillator
(LO) with adjustable frequency between 400 and 500 kHz is readily available for frequency
mixing.


Figure 5.11 Designing an Armstrong indirect modulator.



The modulator is shown in Fig. 5.11. We need to determine $M_1$, $M_2$, and $f_{LO}$. First, the
NBFM generator generates

$$f_{c1} = 20{,}000 \quad\text{and}\quad \Delta f_1 = 5$$

The final WBFM should have

$$f_{c4} = 97.3 \times 10^6 \qquad \Delta f_4 = 10{,}240$$

We first find the total factor of frequency multiplication needed as

$$M_1 \cdot M_2 = \frac{\Delta f_4}{\Delta f_1} = 2048 = 2^{11} \tag{5.25}$$

Because only frequency doublers can be used, we have three equations:

$$M_1 = 2^{n_1} \qquad M_2 = 2^{n_2} \qquad n_1 + n_2 = 11$$


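It is also clear that

$$f_{c2} = 2^{n_1} f_{c1} \quad\text{and}\quad f_{c4} = 2^{n_2} f_{c3}$$

To find $f_{LO}$, there are three possible relationships:

$$f_{c3} = f_{c2} \pm f_{LO} \quad\text{and}\quad f_{c3} = f_{LO} - f_{c2}$$

Each should be tested to determine the one that will fall in

$$400{,}000 \le f_{LO} \le 500{,}000$$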


(a) First, we test $f_{c3} = f_{c2} - f_{LO}$. This case leads to

$$97.3 \times 10^6 = 2^{n_2}\left(2^{n_1} f_{c1} - f_{LO}\right) = 2^{n_1 + n_2} f_{c1} - 2^{n_2} f_{LO} = 2^{11}\left(20 \times 10^3\right) - 2^{n_2} f_{LO}$$

Thus, we have

$$f_{LO} = 2^{-n_2}\left(4.096 \times 10^7 - 9.73 \times 10^7\right) < 0$$

This is outside the local oscillator frequency range.

(b) Next, we test $f_{c3} = f_{c2} + f_{LO}$. This case leads to

$$97.3 \times 10^6 = 2^{n_2}\left(2^{n_1} f_{c1} + f_{LO}\right) = 2^{11}\left(20 \times 10^3\right) + 2^{n_2} f_{LO}$$

Thus, we have

$$f_{LO} = 2^{-n_2}\left(5.634 \times 10^7\right)$$

If $n_2 = 7$, then $f_{LO} = 440$ kHz, which is within the realizable range of the local oscillator.

(c) If we choose $f_{c3} = f_{LO} - f_{c2}$, then we have

$$97.3 \times 10^6 = 2^{n_2}\left(f_{LO} - 2^{n_1} f_{c1}\right) = 2^{n_2} f_{LO} - 2^{11}\left(20 \times 10^3\right)$$

Thus, we have

$$f_{LO} = 2^{-n_2}\left(13.826 \times 10^7\right)$$

No integer $n_2$ will lead to a realizable $f_{LO}$.

Thus, the final design is $M_1 = 16$, $M_2 = 128$, and $f_{LO} = 440$ kHz.
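
The three cases can also be tested exhaustively. The following is a sketch rather than part of the worked example: the function name and the tuple layout are illustrative choices, and it simply tries every split of the 11 doublers against each mixer relationship.

```python
def design_armstrong(fc1, fc4, total_doublers, lo_min, lo_max):
    """Enumerate doubler splits and mixer modes that give a realizable f_LO."""
    designs = []
    for n1 in range(total_doublers + 1):
        n2 = total_doublers - n1
        fc2 = fc1 * 2**n1              # carrier after the first multiplier
        fc3 = fc4 / 2**n2              # carrier required before the second multiplier
        candidates = {
            "fc3 = fc2 - fLO": fc2 - fc3,
            "fc3 = fc2 + fLO": fc3 - fc2,
            "fc3 = fLO - fc2": fc3 + fc2,
        }
        for mode, f_lo in candidates.items():
            if lo_min <= f_lo <= lo_max:
                designs.append((2**n1, 2**n2, mode, f_lo))
    return designs


designs = design_armstrong(fc1=20e3, fc4=97.3e6, total_doublers=11,
                           lo_min=400e3, lo_max=500e3)
for m1, m2, mode, f_lo in designs:
    print(m1, m2, mode, round(f_lo))
```

Only one combination survives the LO range check, with an LO frequency of 440,156.25 Hz, which the example rounds to 440 kHz.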


5.4 DEMODULATION OF FM SIGNALS 

The information in an FM signal resides in the instantaneous frequency $\omega_i = \omega_c + k_f m(t)$.
Hence, a frequency-selective network with a transfer function of the form $|H(f)| = 2\pi a f + b$
over the FM band would yield an output proportional to the instantaneous frequency
(Fig. 5.12a).† There are several possible circuits with such characteristics. The simplest among
them is an ideal differentiator with the transfer function $j2\pi f$.


Figure 5.12 (a) FM demodulator frequency response. (b) Output of a differentiator to the input FM wave.
(c) FM demodulation by direct differentiation.

† Provided the variations of $\omega_i$ are slow in comparison to the time constant of the network.



If we apply $\varphi_{FM}(t)$ to an ideal differentiator, the output is

$$\dot\varphi_{FM}(t) = \frac{d}{dt}\left\{A\cos\left[\omega_c t + k_f\int_{-\infty}^{t} m(\alpha)\,d\alpha\right]\right\} = A\left[\omega_c + k_f m(t)\right]\sin\left[\omega_c t + k_f\int_{-\infty}^{t} m(\alpha)\,d\alpha - \pi\right] \tag{5.26}$$

Both the amplitude and the frequency of the signal $\dot\varphi_{FM}(t)$ are modulated (Fig. 5.12b), the
envelope being $A[\omega_c + k_f m(t)]$. Because $\Delta\omega = k_f m_p < \omega_c$, we have $\omega_c + k_f m(t) > 0$ for
all $t$, and $m(t)$ can be obtained by envelope detection of $\dot\varphi_{FM}(t)$ (Fig. 5.12c).

The amplitude $A$ of the incoming FM carrier must be constant. If the amplitude $A$ were
not constant, but a function of time, there would be an additional term containing $dA/dt$ on the
right-hand side of Eq. (5.26). Even if this term were neglected, the envelope of $\dot\varphi_{FM}(t)$ would
be $A(t)[\omega_c + k_f m(t)]$, and the envelope-detector output would be proportional to $m(t)A(t)$,
still leading to distortions. Hence, it is essential to maintain $A$ constant. Several factors, such
as channel noise and fading, cause $A$ to vary. This variation in $A$ should be suppressed via the
bandpass limiter (discussed earlier in Sec. 5.3) before the signal is applied to the FM detector.
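
The extra term mentioned above can be written out explicitly. The following is a short worked derivation under the assumption that $A(t)$ is differentiable; the shorthand $\theta(t)$ is introduced here only for compactness.

```latex
% Shorthand (introduced here): \theta(t) = \omega_c t + k_f \int_{-\infty}^{t} m(\alpha)\,d\alpha
\begin{align*}
\frac{d}{dt}\bigl[A(t)\cos\theta(t)\bigr]
  &= \dot A(t)\cos\theta(t) - A(t)\bigl[\omega_c + k_f m(t)\bigr]\sin\theta(t) \\
\text{envelope}
  &= \sqrt{\dot A^2(t) + A^2(t)\bigl[\omega_c + k_f m(t)\bigr]^2}
   \;\approx\; A(t)\bigl[\omega_c + k_f m(t)\bigr]
   \quad\text{when } |\dot A(t)| \ll A(t)\,\omega_c
\end{align*}
```

Even in the favorable case where the $\dot A(t)$ term is negligible, the detected envelope still carries the product $A(t)\,m(t)$, which is why the limiter has to precede the differentiator.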


Practical Frequency Demodulators 

The differentiator is only one way to convert frequency variation of FM signals into amplitude
variation that subsequently can be detected by means of envelope detectors. One can use
an operational amplifier differentiator at the FM receiver. On the other hand, the role of the
differentiator can be filled by any linear system whose frequency response contains a linear
segment of positive slope. Because it approximates the ideal linear slope in Fig. 5.12a, this method
is known as slope detection.

One simple device would be an RC high-pass filter of Fig. 5.13. The RC frequency response
is simply

$$H(f) = \frac{j2\pi fRC}{1 + j2\pi fRC} \approx j2\pi fRC \qquad\text{if } 2\pi fRC \ll 1$$

Thus, if the parameter $RC$ is very small such that its product with the carrier frequency
$\omega_c RC \ll 1$, the RC filter approximates a differentiator.

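How closely the RC high-pass filter tracks an ideal differentiator can be checked with a few lines of code. This is a sketch with hypothetical values: the 10.7 MHz carrier and the choice $2\pi f_c RC = 0.01$ are assumptions made for illustration.

```python
import math


def rc_highpass(f_hz, rc):
    """Frequency response H(f) = j2*pi*f*RC / (1 + j2*pi*f*RC)."""
    jwrc = 1j * 2.0 * math.pi * f_hz * rc
    return jwrc / (1.0 + jwrc)


def ideal_differentiator(f_hz, rc):
    """Scaled ideal differentiator response j2*pi*f*RC."""
    return 1j * 2.0 * math.pi * f_hz * rc


# Hypothetical carrier, with RC chosen so that 2*pi*f_c*RC = 0.01.
f_c = 10.7e6
rc = 0.01 / (2.0 * math.pi * f_c)

h = rc_highpass(f_c, rc)
d = ideal_differentiator(f_c, rc)
rel_error = abs(h - d) / abs(d)
print(abs(h), rel_error)
```

The relative error is roughly equal to $2\pi f RC$ itself, so a tighter approximation costs output amplitude: the filter gain at the carrier shrinks in the same proportion.
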
Similarly, a simple tuned RLC circuit followed by an envelope detector can also serve as
a frequency detector because its frequency response $|H(f)|$ below the resonance frequency
$\omega_0 = 1/\sqrt{LC}$ approximates a linear slope. Thus, such a receiver design requires that

$$\omega_c < \omega_0 = \frac{1}{\sqrt{LC}}$$

Figure 5.13 (a) RC high-pass filter. (b) Segment of positive slope in amplitude response.

Because the operation is on the slope of $|H(f)|$, this method is also called slope detection.

Since, however, the slope of $|H(f)|$ is linear over only a small band, there is considerable
distortion in the output. This fault can be partially corrected by a balanced discriminator
formed by two slope detectors. Another balanced demodulator, the ratio detector, also widely
used in the past, offers better protection against carrier amplitude variations than does the
discriminator. For many years ratio detectors were standard in almost all FM receivers.¹⁰

Zero-crossing detectors are also used because of advances in digital integrated circuits.
The first step is to use the amplitude limiter of Fig. 5.9a to generate the rectangular pulse
output of Fig. 5.9c. The resulting rectangular pulse train of varying width can then be applied
to trigger a digital counter. Such frequency counters are designed to measure the
instantaneous frequency from the number of zero crossings. The rate of zero crossings is equal to
the instantaneous frequency of the input signal.
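
The counting idea can be sketched in software. The example below is an illustration only: it counts positive-going zero crossings (one per cycle, as a pulse-triggered counter would) on a hypothetical unmodulated 1 kHz tone; the sampling rate and duration are assumptions.

```python
import math


def estimate_frequency(samples, fs_hz):
    """Estimate frequency in Hz by counting positive-going zero crossings."""
    crossings = sum(1 for prev, cur in zip(samples, samples[1:])
                    if prev < 0.0 <= cur)
    duration_s = len(samples) / fs_hz
    return crossings / duration_s


# Hypothetical test tone: 1 kHz sampled at 100 kHz for 0.1 s, with a small
# phase offset so that no sample falls exactly on a zero crossing.
fs, f0, n = 100_000.0, 1_000.0, 10_000
tone = [math.sin(2.0 * math.pi * f0 * k / fs + 0.1) for k in range(n)]
estimate = estimate_frequency(tone, fs)
print(estimate)
```

For an FM signal the same count would be taken over a window that is long compared with the carrier period but short compared with the variations of $m(t)$, so that the count follows the instantaneous frequency.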


FM Demodulation via PLL 

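Consider a PLL that is in lock with input signal $\sin[\omega_c t + \theta_i(t)]$ and output error signal $e_o(t)$.
When the input signal is an FM signal,

$$\theta_i(t) = k_f\int_{-\infty}^{t} m(\alpha)\,d\alpha + \frac{\pi}{2} \tag{5.27}$$

then,

$$\theta_o(t) = k_f\int_{-\infty}^{t} m(\alpha)\,d\alpha + 0.5\pi - \theta_e(t)$$

With the PLL in lock we can assume a small frequency error $\dot\theta_e(t) \approx 0$. Thus, the loop filter output
signal is

$$e_o(t) = \frac{1}{c}\,\dot\theta_o(t) = \frac{1}{c}\,\frac{d}{dt}\left[k_f\int_{-\infty}^{t} m(\alpha)\,d\alpha + 0.5\pi - \theta_e(t)\right] \approx \frac{k_f}{c}\,m(t) \tag{5.28}$$
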

Thus, the PLL acts as an FM demodulator. If the incoming signal is a PM wave, then
$e_o(t) = k_p\dot m(t)/c$. In this case we need to integrate $e_o(t)$ to obtain the desired signal $m(t)$.

To more precisely analyze PLL behavior as an FM demodulator, we consider the case of
a small error (linear model of the PLL) with $H(s) = 1$. For this case, feedback analysis of the
small-error PLL in Chapter 4 becomes

$$\Theta_o(s) = \frac{AKH(s)}{s + AKH(s)}\,\Theta_i(s) = \frac{AK}{s + AK}\,\Theta_i(s)$$

If $E_o(s)$ and $M(s)$ are Laplace transforms of $e_o(t)$ and $m(t)$, respectively, then from Eqs. (5.27)
and (5.28) we have

$$\Theta_i(s) = \frac{k_f M(s)}{s} \quad\text{and}\quad s\,\Theta_o(s) = c\,E_o(s)$$

Hence,

$$E_o(s) = \frac{k_f}{c}\,\frac{AK}{s + AK}\,M(s)$$

Thus, the PLL output $e_o(t)$ is a distorted version of $m(t)$ and is equivalent to the output of a
single-pole circuit (such as a simple RC circuit) with transfer function $k_f AK/[c(s + AK)]$ to
which $m(t)$ is the input. To reduce distortion, we must choose $AK$ well above the bandwidth
of $m(t)$, so that $e_o(t) \approx k_f m(t)/c$.
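
The requirement that $AK$ be well above the message bandwidth can be quantified from the single-pole response. A minimal sketch, assuming a 15 kHz audio bandwidth and a loop corner ten times higher (both numbers are hypothetical, as is the function name):

```python
import math


def pll_response(f_hz, loop_gain_ak):
    """Normalized PLL demodulator response AK / (j2*pi*f + AK)."""
    return loop_gain_ak / complex(loop_gain_ak, 2.0 * math.pi * f_hz)


# Hypothetical numbers: 15 kHz message bandwidth, AK at ten times that (in rad/s).
bandwidth_hz = 15e3
ak_wide = 2.0 * math.pi * 10.0 * bandwidth_hz
ak_narrow = 2.0 * math.pi * bandwidth_hz

h_wide = pll_response(bandwidth_hz, ak_wide)
h_narrow = pll_response(bandwidth_hz, ak_narrow)
print(abs(h_wide), abs(h_narrow))
```

With the wide loop, the band-edge component is attenuated by about 0.5%; with $AK$ equal to the message bandwidth in rad/s, it drops to about 0.707 of its value, which is the distortion the text warns against.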

In the presence of small noise, the behavior of the PLL is comparable to that of a frequency
discriminator. The advantage of the PLL over a frequency discriminator appears only when
the noise is large.


5.5 EFFECTS OF NONLINEAR DISTORTION 
AND INTERFERENCE 

Immunity of Angle Modulation to Nonlinearities 

A very useful feature of angle modulation is its constant amplitude, which makes it less
susceptible to nonlinearities. Consider, for instance, an amplifier with higher-order nonlinear
distortion whose input $x(t)$ and output $y(t)$ are related by

$$y(t) = a_0 + a_1 x(t) + a_2 x^2(t) + \cdots + a_n x^n(t)$$

Clearly, the first-order term $a_1 x(t)$ is the desired signal amplification term, while the remaining terms are
the unwanted nonlinear distortion. For the angle-modulated signal

$$x(t) = A\cos\left[\omega_c t + \psi(t)\right]$$

trigonometric identities can be applied to rewrite the nonideal system output $y(t)$ as

$$y(t) = c_0 + c_1\cos\left[\omega_c t + \psi(t)\right] + c_2\cos\left[2\omega_c t + 2\psi(t)\right] + \cdots + c_n\cos\left[n\omega_c t + n\psi(t)\right]$$

Because sufficiently large $\omega_c$ makes each component of $y(t)$ separable in the frequency domain,
a bandpass filter centered at $\omega_c$ with bandwidth equal to $B_{FM}$ (or $B_{PM}$) can extract the
desired FM signal component $c_1\cos[\omega_c t + \psi(t)]$ without any distortion. This shows that
angle-modulated signals are immune to nonlinear distortions.

A similar nonlinearity in AM not only causes unwanted modulation with carrier
frequencies $n\omega_c$ but also causes distortion of the desired signal. For instance, if a DSB-SC signal
$m(t)\cos\omega_c t$ passes through a nonlinearity $y(t) = a\,x(t) + b\,x^3(t)$, the output is

$$y(t) = a\,m(t)\cos\omega_c t + b\,m^3(t)\cos^3\omega_c t = \left[a\,m(t) + \frac{3b}{4}\,m^3(t)\right]\cos\omega_c t + \frac{b}{4}\,m^3(t)\cos 3\omega_c t$$

Passing this signal through a bandpass filter still yields $[a\,m(t) + (3b/4)m^3(t)]\cos\omega_c t$. Observe
the distortion component $(3b/4)m^3(t)$ present along with the desired signal $a\,m(t)$.

Immunity from nonlinearity is the primary reason for the use of angle modulation in
microwave radio relay systems, where power levels are high. This requires highly efficient
nonlinear class C amplifiers. In addition, the constant amplitude of FM gives it a kind of
immunity to rapid fading. The effect of amplitude variations caused by rapid fading can be
eliminated by using automatic gain control and bandpass limiting. These advantages made FM
attractive as the technology behind the first-generation (1G) cellular phone system.
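
The contrast between the DSB-SC result and the angle-modulated case can be made concrete with the cubic nonlinearity $y = ax + bx^3$ used above. This is a sketch with hypothetical coefficients $a$ and $b$; it evaluates only the amplitude of the term at $\omega_c$ that survives the bandpass filter.

```python
def cubic_through_bandpass(amplitude, a, b):
    """Amplitude of the cos(w_c t) term after y = a*x + b*x**3 and bandpass filtering.

    Uses cos^3(x) = (3*cos(x) + cos(3x)) / 4, so an input E*cos(w_c t) yields a
    fundamental of amplitude a*E + (3*b/4)*E**3.
    """
    return a * amplitude + 0.75 * b * amplitude**3


a, b = 1.0, 0.1                        # hypothetical nonlinearity coefficients

# DSB-SC: the amplitude is m(t), so the effective gain depends on m(t) itself.
gains_am = [cubic_through_bandpass(m, a, b) / m for m in (0.5, 1.0, 2.0)]

# FM: the amplitude A is constant, so the in-band term is merely rescaled.
A = 1.0
gains_fm = [cubic_through_bandpass(A, a, b) / A for _ in range(3)]
print(gains_am, gains_fm)
```

The AM gain changes with the message level, which is distortion, whereas the FM gain is one fixed number and leaves the phase $\psi(t)$, and therefore the message, untouched.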

The same advantages of FM also make it attractive for microwave radio relay systems. In 
the legacy analog long-haul telephone systems, several channels are multiplexed by means of 
SSB signals to form L-carrier signals. The multiplexed signals are frequency-modulated and 
transmitted over a microwave radio relay system with many links in tandem. In this application, 
however, FM is used not to reduce noise effects but to realize other advantages of constant 
amplitude, and, hence, NBFM rather than WBFM is used. 


Interference Effect 

Angle modulation is also less vulnerable than AM to small-signal interference from adjacent 
channels. 

Let us consider the simple case of the interference of an unmodulated carrier $A\cos\omega_c t$
with another sinusoid $I\cos(\omega_c + \omega)t$. The received signal $r(t)$ is

$$r(t) = A\cos\omega_c t + I\cos(\omega_c + \omega)t = (A + I\cos\omega t)\cos\omega_c t - I\sin\omega t\,\sin\omega_c t = E_r(t)\cos\left[\omega_c t + \psi_d(t)\right]$$

where

$$\psi_d(t) = \tan^{-1}\frac{I\sin\omega t}{A + I\cos\omega t}$$

When the interfering signal is small in comparison to the carrier ($I \ll A$),

$$\psi_d(t) \simeq \frac{I}{A}\sin\omega t \tag{5.29}$$

The phase of $E_r(t)\cos[\omega_c t + \psi_d(t)]$ is $\psi_d(t)$, and its instantaneous frequency is $\omega_c + \dot\psi_d(t)$.
If the signal $E_r(t)\cos[\omega_c t + \psi_d(t)]$ is applied to an ideal phase demodulator, the output $y_d(t)$
would be $\psi_d(t)$. Similarly, the output $y_d(t)$ of an ideal frequency demodulator would be $\dot\psi_d(t)$.
Hence,

$$y_d(t) = \frac{I}{A}\sin\omega t \qquad\text{for PM} \tag{5.30}$$

$$y_d(t) = \frac{I\omega}{A}\cos\omega t \qquad\text{for FM} \tag{5.31}$$


Observe that in either case, the interference output is inversely proportional to the carrier
amplitude $A$. Thus, the larger the carrier amplitude $A$, the smaller the interference effect. This
behavior is very different from that in AM signals, where the interference output is independent
of the carrier amplitude.* Hence, angle-modulated systems are much better than AM systems
at suppressing weak interference ($I \ll A$).

Figure 5.14 Effect of interference in PM, FM, and FM with preemphasis-deemphasis (PDE).
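
The inverse dependence on $A$ can be seen by comparing the exact phase deviation with the small-interference approximation of Eq. (5.29). The sketch below uses a hypothetical interference amplitude $I = 0.01$ and sweeps the carrier amplitude; the function names are illustrative.

```python
import math


def interference_phase(A, I, wt):
    """Exact phase deviation psi_d = atan(I*sin(wt) / (A + I*cos(wt)))."""
    return math.atan2(I * math.sin(wt), A + I * math.cos(wt))


def peak_phase(A, I, steps=360):
    """Peak of |psi_d| over one cycle of the beat frequency, sampled every degree."""
    return max(abs(interference_phase(A, I, 2.0 * math.pi * k / steps))
               for k in range(steps))


I = 0.01                               # hypothetical weak interferer
for A in (1.0, 2.0, 4.0):
    print(A, peak_phase(A, I), I / A)  # exact peak vs. the I/A approximation
```

Doubling the carrier amplitude halves the peak phase error, and for $I \ll A$ the exact peak is indistinguishable from $I/A$ at this resolution.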

Because of the suppression of weak interference in FM, we observe what is known as the
capture effect when listening to FM radios. For two transmitters with carrier frequency
separation less than the audio range, instead of getting interference, we observe that the stronger
carrier effectively suppresses (captures) the weaker carrier. Subjective tests show that an
interference level as low as 35 dB in the audio signals can cause objectionable effects. Hence, in
AM, the interference level should be kept below 35 dB. On the other hand, for FM, because
of the capture effect, the interference level need only be below 6 dB.

The interference amplitude ($I/A$ for PM and $I\omega/A$ for FM) vs. $\omega$ at the receiver output
is shown in Fig. 5.14. The interference amplitude is constant for all $\omega$ in PM but increases
linearly with $\omega$ in FM.†

Interference due to Channel Noise 

The channel noise acts as interference in an angle-modulated signal. We shall consider the
most common form of noise, white noise, which has a constant power spectral density. Such a
noise may be considered as a sum of sinusoids of all frequencies in the band. All components
have the same amplitudes (because of uniform density). This means $I$ is constant for all $\omega$, and
the amplitude spectrum of the interference at the receiver output is as shown in Fig. 5.14. The
interference amplitude spectrum is constant for PM, and increases linearly with $\omega$ for FM.

Preemphasis and Deemphasis in FM Broadcasting 

Figure 5.14 shows that in FM, the interference (the noise) increases linearly with frequency,
and the noise power in the receiver output is concentrated at higher frequencies. A glance at
Fig. 4.18b shows that the PSD of an audio signal $m(t)$ is concentrated at lower frequencies
below 2.1 kHz. Thus, the noise PSD is concentrated at higher frequencies, where $m(t)$ is
the weakest. This may seem like a disaster. But actually, in this very situation there is a
hidden opportunity to reduce noise greatly. The process, shown in Fig. 5.15, works as follows.
At the transmitter, the weaker high-frequency components (beyond 2.1 kHz) of the audio
signal $m(t)$ are boosted before modulation by a preemphasis filter of transfer function $H_p(f)$.
At the receiver, the demodulator output is passed through a deemphasis filter of transfer
function $H_d(f) = 1/H_p(f)$. Thus, the deemphasis filter undoes the preemphasis by attenuating
(deemphasizing) the higher frequency components (beyond 2.1 kHz), and thereby restores the
original signal $m(t)$. The noise, however, enters at the channel, and therefore has not been
preemphasized (boosted). However, it passes through the deemphasis filter, which attenuates
its higher frequency components, where most of the noise power is concentrated (Fig. 5.14).
Thus, the process of preemphasis-deemphasis (PDE) leaves the desired signal untouched but
reduces the noise power considerably.

Figure 5.15 Preemphasis-deemphasis in an FM system.

Figure 5.16 (a) Preemphasis filter and (b) its frequency response. (c) Deemphasis filter and (d) its frequency response.

* For instance, an AM signal with an interfering sinusoid $I\cos(\omega_c + \omega)t$ is given by

$$r(t) = [A + m(t)]\cos\omega_c t + I\cos(\omega_c + \omega)t = [A + m(t) + I\cos\omega t]\cos\omega_c t - I\sin\omega t\,\sin\omega_c t$$

The envelope of this signal is

$$E(t) = \left\{[A + m(t) + I\cos\omega t]^2 + I^2\sin^2\omega t\right\}^{1/2} \approx A + m(t) + I\cos\omega t \qquad I \ll A$$

Thus the interference signal at the envelope detector output is $I\cos\omega t$, which is independent of the carrier amplitude
$A$. We obtain the same result when synchronous demodulation is used. We come to a similar conclusion for AM-SC
systems.

† The results in Eqs. (5.30) and (5.31) can be readily extended to more than one interfering sinusoid. The system
behaves linearly for multiple interfering sinusoids provided their amplitudes are very small in comparison to the
carrier amplitude.


Preemphasis and Deemphasis Filters 

Figure 5.14 suggests an opportunity for preemphasis. FM has smaller interference than PM
at lower frequencies, while the opposite is true at higher frequencies. If we can make our system
behave like FM at lower frequencies and behave like PM at higher frequencies, we will have
the best of both worlds. This is accomplished by a system used in commercial broadcasting
(Fig. 5.15) with the preemphasis (before modulation) and deemphasis (after demodulation)
filters $H_p(f)$ and $H_d(f)$ shown in Fig. 5.16. The frequency $f_1$ is 2.1 kHz, and $f_2$ is typically
30 kHz or more (well beyond audio range), so that $f_2$ does not even enter into the picture.
These filters can be realized by simple RC circuits (Fig. 5.16). The choice of $f_1 = 2.1$ kHz
was apparently made on an experimental basis. It was found that this choice of $f_1$ maintained
the same peak amplitude $m_p$ with or without preemphasis.¹¹ This satisfied the constraint of a
fixed transmission bandwidth.

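The preemphasis transfer function is

$$H_p(f) = K\,\frac{j2\pi f + \omega_1}{j2\pi f + \omega_2} \tag{5.32a}$$

where $K$, the gain, is set at a value of $\omega_2/\omega_1$. Thus,

$$H_p(f) = \left(\frac{\omega_2}{\omega_1}\right)\frac{j2\pi f + \omega_1}{j2\pi f + \omega_2} \tag{5.32b}$$

For $2\pi f \ll \omega_1$,

$$H_p(f) \simeq 1 \tag{5.32c}$$

For frequencies $\omega_1 \ll 2\pi f \ll \omega_2$,

$$H_p(f) \simeq \frac{j2\pi f}{\omega_1} \tag{5.32d}$$
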

Thus, the preemphasizer acts as a differentiator at intermediate frequencies (2.1 to 15 kHz),
which effectively makes the scheme PM over these frequencies. This means that FM with
PDE is FM over the modulating-signal frequency range of 0 to 2.1 kHz and is nearly PM over
the range of 2.1 to 15 kHz, as desired.

The deemphasis filter $H_d(f)$ is given by

$$H_d(f) = \frac{\omega_1}{j2\pi f + \omega_1}$$

Note that for $2\pi f \ll \omega_2$, $H_p(f) \simeq (j2\pi f + \omega_1)/\omega_1$. Hence, $H_p(f)H_d(f) \simeq 1$ over the
baseband of 0 to 15 kHz.

For historical and practical reasons, optimum PDE filters are not used in practice. It can
be shown that the PDE enhances the SNR by 13.27 dB (a power ratio of 21.25).
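
The filter pair is easy to evaluate numerically. The sketch below takes $f_1 = 2.1$ kHz and $f_2 = 30$ kHz from the text (the text gives 30 kHz only as a typical lower value) and checks both the cascade $H_p(f)H_d(f)$ and the decibel value of the quoted power ratio; the function names are illustrative.

```python
import math

F1_HZ, F2_HZ = 2.1e3, 30e3             # corner frequencies quoted in the text


def preemphasis(f_hz):
    """H_p(f) = (w2/w1) * (j2*pi*f + w1) / (j2*pi*f + w2)."""
    w1, w2 = 2.0 * math.pi * F1_HZ, 2.0 * math.pi * F2_HZ
    jw = 1j * 2.0 * math.pi * f_hz
    return (w2 / w1) * (jw + w1) / (jw + w2)


def deemphasis(f_hz):
    """H_d(f) = w1 / (j2*pi*f + w1)."""
    w1 = 2.0 * math.pi * F1_HZ
    return w1 / (1j * 2.0 * math.pi * f_hz + w1)


for f in (100.0, 2.1e3, 15e3):
    print(f, abs(preemphasis(f)), abs(preemphasis(f) * deemphasis(f)))

snr_gain_db = 10.0 * math.log10(21.25)  # quoted power ratio expressed in dB
print(snr_gain_db)
```

With $f_2$ at exactly 30 kHz the cascade magnitude falls to about 0.89 at 15 kHz, so the approximation $H_pH_d \simeq 1$ is tight only where $2\pi f \ll \omega_2$; a larger $f_2$ flattens it further.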

The side benefit of PDE is improvement in the interference characteristics. Because the
interference (from unwanted signals and the neighboring stations) enters after the transmitter
stage, it undergoes only the deemphasis operation, not the boosting, or preemphasis. Hence,
the interference amplitudes for frequencies beyond 2.1 kHz undergo attenuation that is roughly
linear with frequency.

The PDE method of noise reduction is not limited to FM broadcast. It is also used in
audiotape recording and in (analog) phonograph recording, where the hissing noise is also
concentrated at the high-frequency end. A sharp, hissing sound is caused by irregularities in
the recording material. The Dolby noise reduction systems for audiotapes operate on the
same principle, although the Dolby-A system is somewhat more elaborate. In the Dolby-B and
Dolby-C systems, the band is divided into two subbands (below and above 3 kHz instead of
2.1 kHz). In the Dolby-A system, designed for commercial use, the band is divided into four
subbands (below 80 Hz, 80 Hz to 3 kHz, 3 to 9 kHz, and above 9 kHz). The amount of preemphasis
is optimized for each band.

We could also use PDE in AM broadcasting to improve the output SNR. In practice,
however, this is not done for several reasons. First, the output noise amplitude in AM is
constant with frequency and does not increase linearly as in FM. Hence, the deemphasis does
not yield such a dramatic improvement in AM as it does in FM. Second, introduction of PDE
would necessitate modifications of receivers already in use. Third, increasing high-frequency
component amplitudes (preemphasis) would increase interference with adjacent stations (no
such problem arises in FM). Moreover, an increase in the frequency deviation ratio $\beta$ at high
frequencies would make detector design more difficult.


5.6 SUPERHETERODYNE ANALOG AM/FM 
RECEIVERS 


The radio receiver used in broadcast AM and FM systems is called the superheterodyne
receiver (Fig. 5.17). It consists of an RF (radio-frequency) section, a frequency converter
(Example 4.2), an intermediate-frequency (IF) amplifier, an envelope detector, and an audio
amplifier.

The RF section consists basically of a tunable filter and an amplifier that picks up the
desired station by tuning the filter to the right frequency band. The next section, the frequency
mixer (converter), translates the carrier from $\omega_c$ to a fixed IF frequency of $\omega_{IF}$ (see Example
4.2 for frequency conversion). For this purpose, the receiver uses a local oscillator whose
frequency $f_{LO}$ is exactly $f_{IF}$ above the incoming carrier frequency $f_c$; that is,

$$f_{LO} = f_c + f_{IF}$$

The simultaneous tuning of the local oscillator and the RF tunable filter is done by one joint
knob. Tuning capacitors in both circuits are ganged together and are designed so that the tuning
frequency of the local oscillator is always $f_{IF}$ Hz above the tuning frequency $f_c$ of the RF filter.
This means every station that is tuned in is translated to a fixed carrier frequency of $f_{IF}$ Hz by
the frequency converter for subsequent processing at IF.

Figure 5.17 Superheterodyne receiver.

This superheterodyne receiver structure is broadly utilized in most broadcast systems. The 
intermediate frequencies are chosen to be 455 kHz (AM radio), 10.7 MHz (FM radio), and 
38 MHz (TV reception). 

As discovered by Armstrong for AM signals, the translation of all stations to a fixed
intermediate frequency ($f_{IF} = 455$ kHz for AM) allows us to obtain adequate selectivity. It is
difficult to design precise bandpass filters of bandwidth 10 kHz (the modulated audio spectrum)
if the center frequency $f_c$ is very high. This is particularly true in the case of tunable filters.
Hence, the RF filter cannot provide adequate selectivity against adjacent channels. But when
this signal is translated to an IF frequency by a converter, it is further amplified by an IF
amplifier (usually a three-stage amplifier), which does have good selectivity. This is because
the IF frequency is reasonably low; moreover, its center frequency is fixed and factory-tuned.
Hence, the IF section can effectively suppress adjacent-channel interference because of its
high selectivity. It also amplifies the signal for envelope detection.

In reality, the entire selectivity is practically realized in the IF section; the RF section
plays a negligible role. The main function of the RF section is image frequency
suppression. As observed in Example 4.2, the output of the mixer, or converter, consists of
components of the difference between the incoming ($f_c$) and the local oscillator
frequencies ($f_{LO}$) (i.e., $f_{IF} = |f_{LO} - f_c|$). Now, consider the AM example. If the incoming carrier
frequency $f_c = 1000$ kHz, then $f_{LO} = f_c + f_{IF} = 1000 + 455 = 1455$ kHz. But another carrier,
with $f_c' = 1455 + 455 = 1910$ kHz, will also be picked up because the difference $f_c' - f_{LO}$ is
also 455 kHz. The station at 1910 kHz is said to be the image of the station at 1000 kHz.
AM stations that are $2f_{IF} = 910$ kHz apart are called image stations and both would appear
simultaneously at the IF output, were it not for the RF filter at the receiver input. The RF filter
may provide poor selectivity against adjacent stations separated by 10 kHz, but it can provide
reasonable selectivity against a station separated by 910 kHz. Thus, when we wish to tune in
a station at 1000 kHz, the RF filter, tuned to 1000 kHz, provides adequate suppression of the
image station at 1910 kHz.
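
The image relationship is simple enough to capture in two helper functions. These are illustrative names, not from the text, and the default of 455 kHz is the AM intermediate frequency quoted above.

```python
def local_oscillator(fc_khz, f_if_khz=455.0):
    """Superheterodyne LO frequency: f_LO = f_c + f_IF (all in kHz)."""
    return fc_khz + f_if_khz


def image_frequency(fc_khz, f_if_khz=455.0):
    """Image station frequency f_c + 2*f_IF, which also mixes down to f_IF."""
    return fc_khz + 2.0 * f_if_khz


fc = 1000.0
f_lo = local_oscillator(fc)
f_img = image_frequency(fc)
print(f_lo, f_img, abs(f_lo - fc), abs(f_img - f_lo))
```

Both the wanted station and its image sit exactly $f_{IF}$ away from the LO, on opposite sides, which is why only a filter ahead of the mixer can tell them apart.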

The receiver (Fig. 5.17) converts the incoming carrier frequency to the IF by using a local 
oscillator of frequency $f_{LO}$ higher than the incoming carrier frequency and, hence, is called 
a superheterodyne receiver. We pick $f_{LO}$ higher than $f_c$ because this leads to a smaller tuning 
ratio of the maximum to minimum tuning frequency for the local oscillator. The AM broadcast-band 
frequencies range from 530 to 1710 kHz. For carriers between 550 and 1600 kHz, the 
superheterodyne $f_{LO}$ ranges from 1005 to 2055 kHz (ratio of 2.045), whereas the subheterodyne 
range of $f_{LO}$ would be 95 to 1145 kHz (ratio of 12.05). It is much easier to design an oscillator 
that is tunable over a smaller frequency ratio. 
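
As a rough check of the tuning ratios quoted above, the following sketch computes both local oscillator ranges. It assumes carrier frequencies from 550 to 1600 kHz, which is the range implied by the quoted oscillator limits (1005 - 455 = 550 and 2055 - 455 = 1600); the variable names are illustrative.

```matlab
% Local-oscillator tuning ratio: oscillator above vs. below the carrier (kHz).
f_IF      = 455;
f_c_range = [550 1600];           % assumed carrier range (see lead-in)

lo_super = f_c_range + f_IF;      % oscillator above the carrier
lo_sub   = f_c_range - f_IF;      % oscillator below the carrier

ratio_super = lo_super(2) / lo_super(1);
ratio_sub   = lo_sub(2)   / lo_sub(1);

fprintf('superheterodyne: %d to %d kHz, ratio %.3f\n', lo_super, ratio_super);
fprintf('subheterodyne:   %d to %d kHz, ratio %.2f\n', lo_sub, ratio_sub);
```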

The importance of the superheterodyne principle in radio and television broadcasting cannot 
be overstressed. In the early days (before 1919), the entire selectivity against adjacent 
stations was realized in the RF filter. Because this filter often had poor selectivity, it was 
necessary to use several stages (several resonant circuits) in cascade for adequate selectivity. In 
the earlier receivers each filter was tuned individually. It was very time-consuming and 
cumbersome to tune in a station by bringing all resonant circuits into synchronism. This task was 
made easier as variable capacitors were ganged together by mounting them on the same shaft 
rotated by one knob. But variable capacitors are bulky, and there is a limit to the number that 
can be ganged together. These factors, in turn, limited the selectivity available from receivers. 
Consequently, adjacent carrier frequencies had to be separated widely, resulting in fewer 
frequency bands. It was the superheterodyne receiver that made it possible to accommodate many 
more radio stations. 

5.7 FM BROADCASTING SYSTEM 


The FCC has assigned a frequency range of 88 to 108 MHz for FM broadcasting, 
with a separation of 200 kHz between adjacent stations and a peak frequency deviation 
$\Delta f = 75$ kHz. 

A monophonic FM receiver is identical to the superheterodyne AM receiver in Fig. 5.17, 
except that the intermediate frequency is 10.7 MHz and the envelope detector is replaced by 
a PLL or a frequency discriminator followed by a deemphasizer. 

Earlier FM broadcasts were monophonic. Stereophonic FM broadcasting, in which two 
audio signals, L (left microphone) and R (right microphone), are used for a more natural effect, 
was proposed later. The FCC ruled that the stereophonic system had to be compatible with 
the original monophonic system. This meant that the older monophonic receivers should be 
able to receive the signal $L + R$, and the total transmission bandwidth for the two signals (L 
and R) should still be 200 kHz, with $\Delta f = 75$ kHz for the two combined signals. This would 
ensure that the older receivers could continue to receive monophonic as well as stereophonic 
broadcasts, although the stereo effect would be absent. 

A transmitter and a receiver for a stereo broadcast are shown in Fig. 5.18a and c. At the 
transmitter, the two signals L and R are added and subtracted to obtain $L + R$ and $L - R$. These 
signals are preemphasized. The preemphasized signal $(L - R)'$ DSB-SC modulates a carrier 
of 38 kHz obtained by doubling the frequency of a 19-kHz signal that is used as a pilot. The 
signal $(L + R)'$ is used directly. All three signals (the third being the pilot) form a composite 
baseband signal $m(t)$ (Fig. 5.18b), 

$$m(t) = (L + R)' + (L - R)' \cos \omega_c t + \alpha \cos \frac{\omega_c t}{2} \qquad (5.33)$$

The reason for using a pilot of 19 kHz rather than 38 kHz is that it is easier to separate 
the pilot at 19 kHz because there are no signal components within 4 kHz of that 
frequency. 

The receiver operation (Fig. 5.18c) is self-explanatory. A monophonic receiver consists of 
only the upper branch of the stereo receiver and, hence, receives only $L + R$. This is of course 
the complete audio signal without the stereo effect. Hence, the system is compatible. The pilot 
is extracted, and (after doubling its frequency) it is used to demodulate coherently the signal 
$(L - R)' \cos \omega_c t$. 

An interesting aspect of stereo transmission is that the peak amplitude of the composite 
signal $m(t)$ in Eq. (5.33) is practically the same as that of the monophonic signal (if we ignore 
the pilot), and, hence, $\Delta f$, which is proportional to the peak signal amplitude for stereophonic 
transmission, remains practically the same as for the monophonic case. This can be explained 
by the so-called interleaving effect as follows. 

The $L'$ and $R'$ signals are very similar in general. Hence, we can assume their peak amplitudes 
to be equal to $A_p$. Under the worst possible conditions, $L'$ and $R'$ will reach their peaks 
at the same time, yielding [Eq. (5.33)] 

$$|m(t)|_{\max} = 2A_p + \alpha$$

In the monophonic case, the peak amplitude of the baseband signal $(L + R)'$ is $2A_p$. Hence, the 
peak amplitudes in the two cases differ only by $\alpha$, the pilot amplitude. To account for this, the 
peak sound amplitude in the stereo case is reduced to 90% of its full value. This amounts to a 
reduction in the signal power by a ratio of $(0.9)^2 = 0.81$, or 1 dB. Thus, the effective SNR is 
reduced by 1 dB because of the inclusion of the pilot. 
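
The interleaving effect can be illustrated numerically. The sketch below builds the composite baseband signal of Eq. (5.33) from two hypothetical test tones and checks its peak against $2A_p + \alpha$. The tone frequencies, the pilot amplitude, and the sampling rate are assumptions chosen for illustration, and preemphasis is omitted.

```matlab
% Composite stereo baseband of Eq. (5.33) with hypothetical test tones.
fs = 400e3;                       % simulation sampling rate (assumed)
t  = 0:1/fs:0.02;
Ap = 1;                           % common peak amplitude of L and R
alpha = 0.1;                      % pilot amplitude (assumed value)
fp = 19e3;                        % pilot frequency; subcarrier is 2*fp = 38 kHz

L = Ap*cos(2*pi*1000*t);          % left-channel test tone (assumed)
R = Ap*cos(2*pi*1500*t + 0.7);    % right-channel test tone (assumed)

m = (L + R) + (L - R).*cos(2*pi*2*fp*t) + alpha*cos(2*pi*fp*t);

peak_stereo = max(abs(m));
peak_mono   = max(abs(L + R));
bound       = 2*Ap + alpha;       % worst-case peak from the text
loss_dB     = -20*log10(0.9);     % power penalty of the 90% amplitude scaling

fprintf('stereo peak %.3f, mono peak %.3f, bound %.3f, loss %.2f dB\n', ...
        peak_stereo, peak_mono, bound, loss_dB);
```

The bound holds for any L and R with peak $A_p$ because $|(L+R) + (L-R)\cos\theta| \le |L|(1+\cos\theta) + |R|(1-\cos\theta) \le 2A_p$.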

Figure 5.18 (a) FM stereo transmitter. (b) Spectrum of a baseband stereo signal. (c) FM stereo receiver. 



5.8 MATLAB EXERCISES 

In this section, we use MATLAB to build an FM modulation and demodulation example. The 
MATLAB program is given by ExampleFM.m. Once again we apply the same message 
signal $m_2(t)$. The FM coefficient is $k_f = 80$ (entered in the program as kf = 160*pi, that is, 
$2\pi \times 80$) and the PM coefficient is $k_p = \pi$. The carrier frequency remains 300 Hz. The 
resulting FM and PM signals in the time domain are shown in Fig. 5.19. The corresponding 
frequency responses are also shown in Fig. 5.19. The frequency domain responses clearly show 
the much higher bandwidths of the FM and PM signals when compared with amplitude modulations. 


% (ExampleFM.m) 
% This program uses triangl.m to illustrate frequency modulation 
% and demodulation 
ts=1.e-4; 
t=-0.04:ts:0.04; 
Ta=0.01; 
m_sig=triangl((t+0.01)/Ta)-triangl((t-0.01)/Ta); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)); 
M_fre=fftshift(fft(m_sig,Lfft)); 
freqm=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
B_m=100;     %Bandwidth of the signal is B_m Hz. 
% Design a simple lowpass filter with bandwidth B_m Hz. 
h=fir1(80,[B_m*ts]); 
% 
% Generate the FM and PM signals (carrier frequency 300 Hz) 
kf=160*pi; 
m_intg=kf*ts*cumsum(m_sig); 
s_fm=cos(2*pi*300*t+m_intg); 
s_pm=cos(2*pi*300*t+pi*m_sig); 
Lfft=length(t); Lfft=2^ceil(log2(Lfft)+1); 
S_fm=fftshift(fft(s_fm,Lfft)); 
S_pm=fftshift(fft(s_pm,Lfft)); 
freqs=(-Lfft/2:Lfft/2-1)/(Lfft*ts); 
% Demodulation: differentiate, half-wave rectify, then lowpass filter 
s_fmdem=diff([s_fm(1) s_fm])/ts/kf; 
s_fmrec=s_fmdem.*(s_fmdem>0); 
s_dec=filter(h,1,s_fmrec); 

% Plot the message and the demodulated output 
Trange1=[-0.04 0.04 -1.2 1.2]; 
figure(1) 
subplot(211);m1=plot(t,m_sig); 
axis(Trange1); set(m1,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it m}({\it t})'); 
title('Message signal'); 
subplot(212);m2=plot(t,s_dec); 
set(m2,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it m}_d({\it t})') 
title('demodulated FM signal'); 

figure(2) 
subplot(211);td1=plot(t,s_fm); 
axis(Trange1); set(td1,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm FM}({\it t})') 
title('FM signal'); 
subplot(212);td2=plot(t,s_pm); 
axis(Trange1); set(td2,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it s}_{\rm PM}({\it t})') 
title('PM signal'); 

figure(3) 
subplot(211);fp1=plot(t,s_fmdem); 
set(fp1,'Linewidth',2); 
xlabel('{\it t} (sec)'); ylabel('{\it d s}_{\rm FM}({\it t})/dt') 
title('FM derivative'); 
subplot(212);fp2=plot(t,s_fmrec); 
set(fp2,'Linewidth',2); 
xlabel('{\it t} (sec)'); 
title('rectified FM derivative'); 

Frange=[-600 600 0 300]; 
figure(4) 
subplot(211);fd1=plot(freqs,abs(S_fm)); 
axis(Frange); set(fd1,'Linewidth',2); 
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm FM}({\it f})') 
title('FM amplitude spectrum'); 
subplot(212);fd2=plot(freqs,abs(S_pm)); 
axis(Frange); set(fd2,'Linewidth',2); 
xlabel('{\it f} (Hz)'); ylabel('{\it S}_{\rm PM}({\it f})') 
title('PM amplitude spectrum'); 

Figure 5.19 FM and PM signals in the time and frequency domains. 


To obtain the demodulation results, a differentiator is first applied to change the 
frequency-modulated signal into an amplitude- and frequency-modulated signal (Fig. 5.20a). 


Figure 5.20 Signals at the demodulator: (a) after differentiator (FM derivative); (b) after rectifier (rectified FM derivative). Horizontal axes: t (sec). 

Figure 5.21 FM modulation and demodulation: (a) original message; (b) recovered signal. Horizontal axes: t (sec). 


Upon applying the rectifier for envelope detection, we see that the envelope of the rectifier 
output closely follows the message signal (Fig. 5.20b). 

Finally, the rectifier output signal is passed through a low-pass filter with bandwidth 
100 Hz. We used the finite impulse response low-pass filter of order 80 this time because of 
the tighter filter constraint in this example. The FM detector output is then compared with the 
original message signal in Fig. 5.21. 

The FM demodulation results clearly show some noticeable distortions. First, the higher 
order low-pass filter has a much longer response time and delay. Second, the distortion during 
the negative half of the message is more severe because the rectifier generates very few 
cycles of the half-sinusoid. This happens because when the message signal is negative, the 
instantaneous frequency of the FM signal is low. Because we used a carrier frequency of only 
300 Hz, the effect of low instantaneous frequency is much more pronounced. If a practical 
carrier frequency of 100 MHz were applied, this kind of distortion would be completely 
negligible. 

PROBLEMS 


5.1-1 Sketch $\varphi_{FM}(t)$ and $\varphi_{PM}(t)$ for the modulating signal $m(t)$ shown in Fig. P5.1-1, given 
$\omega_c = 10^8$, $k_f = 10^5$, and $k_p = 25$. 

Figure P5.1-1 

5.1-2 A baseband signal $m(t)$ is the periodic sawtooth signal shown in Fig. P5.1-2. 

(a) Sketch $\varphi_{FM}(t)$ and $\varphi_{PM}(t)$ for this signal $m(t)$ if $\omega_c = 2\pi \times 10^6$, $k_f = 2000\pi$, and $k_p = \pi/2$. 

(b) Show that the PM signal is equivalent to a PM signal modulated by a rectangular periodic 
message. Explain why it is necessary to use $k_p < \pi$ in this case. [Note that the PM signal 
has a constant frequency but has phase discontinuities corresponding to the discontinuities 
of $m(t)$.] 

5.1-3 Over an interval $|t| < 1$, an angle-modulated signal is given by 

$$\varphi_{EM}(t) = 10 \cos 13{,}000\pi t$$

It is known that the carrier frequency $\omega_c = 10{,}000\pi$. 

(a) If this were a PM signal with $k_p = 1000$, determine $m(t)$ over the interval $|t| < 1$. 

(b) If this were an FM signal with $k_f = 1000$, determine $m(t)$ over the interval $|t| < 1$. 

5.2-1 For a message signal 

$$m(t) = 2 \cos 100t + 18 \cos 2000\pi t$$

(a) Write expressions (do not sketch) for $\varphi_{PM}(t)$ and $\varphi_{FM}(t)$ when $A = 10$, $\omega_c = 10^6$, 
$k_f = 1000\pi$, and $k_p = 1$. For determining $\varphi_{FM}(t)$, use the indefinite integral of $m(t)$; that is, 
take the value of the integral at $t = -\infty$ to be 0. 

(b) Estimate the bandwidths of $\varphi_{FM}(t)$ and $\varphi_{PM}(t)$. 

5.2-2 An angle-modulated signal with carrier frequency $\omega_c = 2\pi \times 10^6$ is described by the equation 

$$\varphi_{EM}(t) = 10 \cos(\omega_c t + 0.1 \sin 2000\pi t)$$

(a) Find the power of the modulated signal. 

(b) Find the frequency deviation $\Delta f$. 

(c) Find the phase deviation $\Delta\phi$. 

(d) Estimate the bandwidth of $\varphi_{EM}(t)$. 

5.2-3 Repeat Prob. 5.2-2 if 

$$\varphi_{EM}(t) = 5 \cos(\omega_c t + 20 \sin 1000\pi t + 10 \sin 2000\pi t)$$

5.2-4 Estimate the bandwidth for $\varphi_{PM}(t)$ and $\varphi_{FM}(t)$ in Prob. 5.1-1. Assume the bandwidth of $m(t)$ in 
Fig. P5.1-1 to be the third-harmonic frequency of $m(t)$. 

5.2-5 Estimate the bandwidth for $\varphi_{PM}(t)$ and $\varphi_{FM}(t)$ in Prob. 5.1-2. Assume the bandwidth of $m(t)$ in 
Fig. P5.1-2 to be the fifth-harmonic frequency of $m(t)$. 

5.2-6 Given $m(t) = \sin 2000\pi t$, $k_f = 200{,}000\pi$, and $k_p = 10$. 

(a) Estimate the bandwidths of $\varphi_{FM}(t)$ and $\varphi_{PM}(t)$. 

(b) Repeat part (a) if the message signal amplitude is doubled. 

(c) Repeat part (a) if the message signal frequency is doubled. 

(d) Comment on the sensitivity of FM and PM bandwidths to the spectrum of $m(t)$. 

5.2-7 Given $m(t) = e^{-t^2}$, $f_c = 10^4$ Hz, $k_f = 6000\pi$, and $k_p = 8000\pi$: 

(a) Find $\Delta f$, the frequency deviation for FM and PM. 

(b) Estimate the bandwidths of the FM and PM waves. 

Hint: Find $M(f)$ and find its 3 dB bandwidth ($B \ll \Delta f$). 

5.3-1 Design (only the block diagram) an Armstrong indirect FM modulator to generate an FM carrier 
with a carrier frequency of 98.1 MHz and $\Delta f = 75$ kHz. A narrowband FM generator is available 
at a carrier frequency of 100 kHz and a frequency deviation $\Delta f = 10$ Hz. The stock room also 
has an oscillator with an adjustable frequency in the range of 10 to 11 MHz. There are also plenty 
of frequency doublers, triplers, and quintuplers. 

5.3-2 Design (only the block diagram) an Armstrong indirect FM modulator to generate an FM carrier 
with a carrier frequency of 96 MHz and $\Delta f = 20$ kHz. A narrowband FM generator with $f_c = 200$ 
kHz and adjustable $\Delta f$ in the range of 9 to 10 Hz is available. The stock room also has an oscillator 
with adjustable frequency in the range of 9 to 10 MHz. There is a bandpass filter with any center 
frequency, and only frequency doublers are available. 

Figure P5.4-1 


Figure P5.4-2 


5.4-1 (a) Show that when $m(t)$ has no jump discontinuities, an FM demodulator followed by an 
integrator (Fig. P5.4-1a) forms a PM demodulator. Explain why it is necessary for the FM 
demodulator to remove any dc offset before the integrator. 

(b) Show that a PM demodulator followed by a differentiator (Fig. P5.4-1b) serves as an FM 
demodulator even if $m(t)$ has jump discontinuities or if the PM demodulator output has dc 
offset. 



(a) PM demodulator 



(b) FM demodulator 


5.4-2 A periodic square wave $m(t)$ (Fig. P5.4-2a) frequency-modulates a carrier of frequency $f_c = 
10$ kHz with $\Delta f = 1$ kHz. The carrier amplitude is $A$. The resulting FM signal is demodulated, 
as shown in Fig. P5.4-2b, by the method discussed in Sec. 5.4 (Fig. 5.12). Sketch the waveforms 
at points b, c, d, and e. 


(a) Square wave $m(t)$ with levels 1 and -1. (b) $m(t)$ applied to an FM modulator, followed by the 
demodulator (d/dt, detector, and blocking stages). 


5.4-3 Use small-error PLL analysis to show that a first-order loop [$H(s) = 1$] cannot track an incoming 
signal whose instantaneous frequency is varying linearly with time [$\theta_i(t) = kt^2$]. This signal can 
be tracked within a constant phase if $H(s) = (s + a)/s$. It can be tracked with a zero phase error 
if $H(s) = (s^2 + as + b)/s^2$. 

5.6-1 A transmitter transmits an AM signal with a carrier frequency of 1500 kHz. When an inexpensive 
radio receiver (which has a poor selectivity in its RF-stage bandpass filter) is tuned to 1500 kHz, 
the signal is heard loud and clear. This same signal is also heard (not as well) at another dial 
setting. State, with reasons, at what frequency you will hear this station. The IF frequency is 
455 kHz. 








5.6-2 Consider a superheterodyne FM receiver designed to receive the frequency band of 1 to 30 MHz 
with an IF frequency of 8 MHz. What is the range of frequencies generated by the local oscillator for 
this receiver? An incoming signal with a carrier frequency of 10 MHz is received at the 10 MHz 
setting. At this setting of the receiver, we also get interference from a signal with some other 
carrier frequency if the receiver RF-stage bandpass filter has poor selectivity. What is the carrier 
frequency of the interfering signal? 



6 SAMPLING AND ANALOG-TO-DIGITAL CONVERSION 


As briefly discussed in Chapter 1, analog signals can be digitized through sampling and 
quantization. This analog-to-digital (A/D) conversion sets the foundation of modern 
digital communication systems. In the A/D converter, the sampling rate must be large 
enough to permit the analog signal to be reconstructed from the samples with sufficient 
accuracy. The sampling theorem, which is the basis for determining the proper (lossless) sampling 
rate for a given signal, has played a huge role in signal processing, communication theory, and 
A/D circuit design. 


6.1 SAMPLING THEOREM 


We first show that a signal $g(t)$ whose spectrum is band-limited to $B$ Hz, that is, 

$$G(f) = 0 \quad \text{for } |f| > B$$

can be reconstructed exactly (without any error) from its discrete time samples taken uniformly 
at a rate of $R$ samples per second. The condition is that $R > 2B$. In other words, the minimum 
sampling frequency for perfect signal recovery is $f_s = 2B$ Hz. 

To prove the sampling theorem, consider a signal $g(t)$ (Fig. 6.1a) whose spectrum is 
band-limited to $B$ Hz (Fig. 6.1b).† For convenience, spectra are shown as functions of $f$ as well as 
of $\omega$. Sampling $g(t)$ at a rate of $f_s$ Hz means that we take $f_s$ uniform samples per second. This 
uniform sampling can be accomplished by multiplying $g(t)$ by an impulse train $\delta_{T_s}(t)$ of Fig. 
6.1c, consisting of unit impulses repeating periodically every $T_s$ seconds, where $T_s = 1/f_s$. 
This results in the sampled signal $\bar{g}(t)$ shown in Fig. 6.1d. The sampled signal consists of 
impulses spaced every $T_s$ seconds (the sampling interval). The $n$th impulse, located at $t = nT_s$, 
has a strength $g(nT_s)$, which is the value of $g(t)$ at $t = nT_s$. Thus, the relationship between the 


† The spectrum $G(f)$ in Fig. 6.1b is shown as real, for convenience. Our arguments are valid for complex $G(f)$. 

Figure 6.1 Sampled signal and its Fourier spectra. 

sampled signal $\bar{g}(t)$ and the original analog signal $g(t)$ is 

$$\bar{g}(t) = g(t)\,\delta_{T_s}(t) = \sum_n g(nT_s)\,\delta(t - nT_s) \qquad (6.1)$$


Because the impulse train $\delta_{T_s}(t)$ is a periodic signal of period $T_s$, it can be expressed as 
an exponential Fourier series, already found in Example 3.11 as 

$$\delta_{T_s}(t) = \frac{1}{T_s} \sum_{n=-\infty}^{\infty} e^{jn\omega_s t} \qquad \omega_s = \frac{2\pi}{T_s} = 2\pi f_s \qquad (6.2)$$


Therefore, 

$$\bar{g}(t) = g(t)\,\delta_{T_s}(t) = \frac{1}{T_s} \sum_{n=-\infty}^{\infty} g(t)\,e^{jn2\pi f_s t} \qquad (6.3)$$


To find $\bar{G}(f)$, the Fourier transform of $\bar{g}(t)$, we take the Fourier transform of the summation 
in Eq. (6.3). Based on the frequency-shifting property, the transform of the $n$th term is shifted 
by $nf_s$. Therefore, 

$$\bar{G}(f) = \frac{1}{T_s} \sum_{n=-\infty}^{\infty} G(f - nf_s) \qquad (6.4)$$


This means that the spectrum $\bar{G}(f)$ consists of $G(f)$, scaled by a constant $1/T_s$, repeating 
periodically with period $f_s = 1/T_s$ Hz, as shown in Fig. 6.1e. 
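
The periodic replication in Eq. (6.4) can be checked numerically. The sketch below evaluates the transform of the impulse-sampled signal, $\sum_n g(nT_s)e^{-j2\pi f nT_s}$, and compares it with $(1/T_s)\sum_n G(f - nf_s)$. The Gaussian pair $g(t) = e^{-\pi t^2}$, $G(f) = e^{-\pi f^2}$ is an assumption chosen only because both sides are known in closed form; the relation itself does not require a band-limited signal.

```matlab
% Numerical check of Eq. (6.4): sampled-signal spectrum vs. shifted replicas.
Ts = 0.5;  fs = 1/Ts;             % sampling interval and rate (assumed values)
n  = -40:40;                      % enough terms for both sums to converge
g  = @(t) exp(-pi*t.^2);          % test signal (assumed)
G  = @(f) exp(-pi*f.^2);          % its Fourier transform

f   = linspace(-3, 3, 61);
lhs = zeros(size(f));
rhs = zeros(size(f));
for k = 1:numel(f)
    % Transform of sum_n g(n*Ts)*delta(t - n*Ts), evaluated at f(k)
    lhs(k) = real(sum(g(n*Ts) .* exp(-1j*2*pi*f(k)*n*Ts)));
    % Scaled, shifted replicas of G(f)
    rhs(k) = sum(G(f(k) - n*fs)) / Ts;
end
err = max(abs(lhs - rhs));
fprintf('max difference between the two sides: %.2e\n', err);
```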

After uniform sampling that generates a set of signal samples $\{g(kT_s)\}$, the vital question 
becomes: Can $g(t)$ be reconstructed from $\bar{g}(t)$ without any loss or distortion? If we are to 
reconstruct $g(t)$ from $\bar{g}(t)$, equivalently in the frequency domain we should be able to recover 
$G(f)$ from $\bar{G}(f)$. Graphically from Fig. 6.1, perfect recovery is possible if there is no overlap 
among the replicas in $\bar{G}(f)$. Figure 6.1e clearly shows that this requires 

$$f_s > 2B \qquad (6.5)$$

Also, the sampling interval $T_s = 1/f_s$. Therefore, 

$$T_s < \frac{1}{2B} \qquad (6.6)$$


Thus, as long as the sampling frequency $f_s$ is greater than twice the signal bandwidth $B$ (in 
hertz), $\bar{G}(f)$ will consist of nonoverlapping repetitions of $G(f)$. When this is true, Fig. 6.1e 
shows that $g(t)$ can be recovered from its samples $\bar{g}(t)$ by passing the sampled signal $\bar{g}(t)$ 
through an ideal low-pass filter of bandwidth $B$ Hz. The minimum sampling rate $f_s = 2B$ 
required to recover $g(t)$ from its samples $\bar{g}(t)$ is called the Nyquist rate for $g(t)$, and the 
corresponding sampling interval $T_s = 1/2B$ is called the Nyquist interval for the low-pass 
signal $g(t)$.* 

We need to stress one important point regarding the possibility of $f_s = 2B$ and a particular 
class of low-pass signals. For a general signal spectrum, we have proved that the sampling 
rate $f_s > 2B$. However, if the spectrum $G(f)$ has no impulse (or its derivatives) at the highest 
frequency $B$, then the overlap is still zero as long as the sampling rate is greater than or equal 
to the Nyquist rate, that is, 

$$f_s \ge 2B$$

If, on the other hand, $G(f)$ contains an impulse at the highest frequency $\pm B$, then the equality 
must be removed or else overlap will occur. In such case, the sampling rate $f_s$ must be greater 
than $2B$ Hz. A well-known example is a sinusoid $g(t) = \sin 2\pi B(t - t_0)$. This signal is 
band-limited to $B$ Hz, but all its samples are zero when uniformly taken at a rate $f_s = 2B$ (starting 
at $t = t_0$), and $g(t)$ cannot be recovered from its Nyquist samples. Thus, for sinusoids, the 
condition of $f_s > 2B$ must be satisfied. 
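
The sinusoid example can be reproduced directly. In this sketch, the values of $B$ and $t_0$ and the faster rate $2.5B$ are arbitrary choices for illustration.

```matlab
% Samples of g(t) = sin(2*pi*B*(t - t0)) taken at exactly fs = 2B vanish.
B  = 50;                          % sinusoid frequency in Hz (assumed)
t0 = 0.003;                       % sampling start time in seconds (assumed)
g  = @(t) sin(2*pi*B*(t - t0));
n  = 0:99;

fs_nyq      = 2*B;                % exactly the Nyquist rate
samples_nyq = g(t0 + n/fs_nyq);   % every sample falls on a zero crossing

fs_fast      = 2.5*B;             % any rate above 2B
samples_fast = g(t0 + n/fs_fast);

max_nyq  = max(abs(samples_nyq)); % zero up to floating-point rounding
max_fast = max(abs(samples_fast));
fprintf('max |sample| at 2B: %.1e, at 2.5B: %.3f\n', max_nyq, max_fast);
```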

6.1.1 Signal Reconstruction from Uniform Samples 

The process of reconstructing a continuous time signal $g(t)$ from its samples is also known as 
interpolation. In Fig. 6.1, we used a constructive proof to show that a signal $g(t)$ band-limited 


* The theorem stated here (and proved subsequently) applies to low-pass signals. A bandpass signal whose spectrum 
exists over a frequency band $f_c - B/2 < |f| < f_c + B/2$ has a bandwidth $B$ Hz. Such a signal is also uniquely 
determined by samples taken at above the Nyquist frequency $2B$. The sampling theorem is generally more complex 
in such case. It uses two interlaced uniform sampling trains, each at half the overall sampling rate $R_s > B$. See, for 
example, Refs. 1 and 2. 

to $B$ Hz can be reconstructed (interpolated) exactly from its samples. This means not only 
that uniform sampling at above the Nyquist rate preserves all the signal information, but also 
that simply passing the sampled signal through an ideal low-pass filter of bandwidth $B$ Hz 
will reconstruct the original message. As seen from Eq. (6.3), the sampled signal contains a 
component $(1/T_s)g(t)$, and to recover $g(t)$ [or $G(f)$], the sampled signal 

$$\bar{g}(t) = \sum_n g(nT_s)\,\delta(t - nT_s)$$

must be sent through an ideal low-pass filter of bandwidth $B$ Hz and gain $T_s$. Such an ideal 
filter response has the transfer function 

$$H(f) = T_s\,\Pi\!\left(\frac{\omega}{4\pi B}\right) = T_s\,\Pi\!\left(\frac{f}{2B}\right) \qquad (6.7)$$


Ideal Reconstruction 

To recover the analog signal from its uniform samples, the ideal interpolation filter transfer 
function found in Eq. (6.7) is shown in Fig. 6.2a. The impulse response of this filter, the inverse 
Fourier transform of $H(f)$, is 

$$h(t) = 2BT_s\,\mathrm{sinc}(2\pi Bt) \qquad (6.8)$$

Assuming the use of Nyquist sampling rate, that is, $2BT_s = 1$, then 

$$h(t) = \mathrm{sinc}(2\pi Bt) \qquad (6.9)$$

This $h(t)$ is shown in Fig. 6.2b. Observe the very interesting fact that $h(t) = 0$ at all Nyquist 
sampling instants ($t = \pm n/2B$) except $t = 0$. When the sampled signal $\bar{g}(t)$ is applied at 
the input of this filter, the output is $g(t)$. Each sample in $\bar{g}(t)$, being an impulse, generates a 
sinc pulse of height equal to the strength of the sample, as shown in Fig. 6.2c. The process is 


Figure 6.2 

Ideal 

interpolation. 

identical to that shown in Fig. 6.6, except that $h(t)$ is a sinc pulse instead of a rectangular pulse. 
Addition of the sinc pulses generated by all the samples results in $g(t)$. The $k$th sample of the 
input $\bar{g}(t)$ is the impulse $g(kT_s)\,\delta(t - kT_s)$; the filter output of this impulse is $g(kT_s)h(t - kT_s)$. 
Hence, the filter output to $\bar{g}(t)$, which is $g(t)$, can now be expressed as a sum, 

$$g(t) = \sum_k g(kT_s)\,h(t - kT_s)$$

$$= \sum_k g(kT_s)\,\mathrm{sinc}\,[2\pi B(t - kT_s)] \qquad (6.10a)$$

$$= \sum_k g(kT_s)\,\mathrm{sinc}\,(2\pi Bt - k\pi) \qquad (6.10b)$$

Equation (6.10) is the interpolation formula, which yields values of $g(t)$ between samples as 
a weighted sum of all the sample values. 
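
A truncated version of the interpolation formula (6.10b) can be tried numerically. The sketch below is illustrative only: the helper `lsinc` implements the convention $\mathrm{sinc}(x) = \sin x / x$ used in Eqs. (6.8) to (6.10), the test signal $g(t) = \mathrm{sinc}^2(\pi Bt)$ is an assumed example that is band-limited to $B$ Hz (its spectrum is a triangle over $|f| \le B$), and the sum is cut off at $|k| \le 2000$, so the result is approximate.

```matlab
% Truncated sinc interpolation, Eq. (6.10b), from Nyquist-rate samples.
lsinc = @(x) (x == 0) + (x ~= 0).*sin(x)./(x + (x == 0));  % sin(x)/x, equal to 1 at x = 0

B  = 4;                           % bandwidth in Hz (assumed)
Ts = 1/(2*B);                     % Nyquist interval
g  = @(t) lsinc(pi*B*t).^2;       % test signal band-limited to B Hz (assumed)

k  = -2000:2000;                  % truncation of the infinite sum
gk = g(k*Ts);                     % the samples g(k*Ts)

t     = linspace(-1, 1, 201);     % instants on and between the sample points
g_hat = zeros(size(t));
for i = 1:numel(t)
    g_hat(i) = sum(gk .* lsinc(2*pi*B*t(i) - k*pi));
end

err = max(abs(g_hat - g(t)));
fprintf('max interpolation error with %d samples: %.1e\n', numel(k), err);
```

The residual error comes only from truncating the sum; the ideal formula uses all samples, which is one way to see why ideal reconstruction is not realizable in practice.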


Example 6.1 


Find a signal $g(t)$ that is band-limited to $B$ Hz and whose samples are 

$$g(0) = 1 \quad \text{and} \quad g(\pm T_s) = g(\pm 2T_s) = g(\pm 3T_s) = \cdots = 0$$

where the sampling interval $T_s$ is the Nyquist interval for $g(t)$, that is, $T_s = 1/2B$. 


Figure 6.3 Signal reconstructed from the Nyquist samples in Example 6.1. 


We use the interpolation formula (6.10b) to construct $g(t)$ from its samples. Since all 
but one of the Nyquist samples are zero, only one term (corresponding to $k = 0$) in the 
summation on the right-hand side of Eq. (6.10b) survives. Thus, 

$$g(t) = \mathrm{sinc}\,(2\pi Bt) \qquad (6.11)$$

This signal is shown in Fig. 6.3. Observe that this is the only signal that has a bandwidth 
$B$ Hz and sample values $g(0) = 1$ and $g(nT_s) = 0$ ($n \ne 0$). No other signal satisfies these 
conditions. 



Practical Signal Reconstruction (Interpolation) 

We established in Sec. 3.5 that the ideal low-pass filter is noncausal and unrealizable. This can 
be equivalently seen from the infinitely long nature of the sinc reconstruction pulse used in 
the ideal reconstruction of Eq. (6.10). For practical application of signal reconstruction (e.g., a 

Figure 6.4 Practical reconstruction (interpolation) pulse. 



CD player), we need to implement realizable signal reconstruction systems from the uniform 
signal samples. 

For practical implementation, this reconstruction pulse $p(t)$ must be easy to generate. For 
example, we may apply the reconstruction pulse $p(t)$ as shown in Fig. 6.4. However, we must 
first use the nonideal interpolation pulse $p(t)$ to analyze the accuracy of the reconstructed 
signal. Let us denote the new signal from reconstruction as 

$$\tilde{g}(t) \triangleq \sum_n g(nT_s)\,p(t - nT_s) \qquad (6.12)$$


To determine its relation to the original analog signal $g(t)$, we can see from the properties of 
convolution and Eq. (6.1) that 

$$\tilde{g}(t) = \sum_n g(nT_s)\,p(t - nT_s) = p(t) * \Big[\sum_n g(nT_s)\,\delta(t - nT_s)\Big] = p(t) * \bar{g}(t) \qquad (6.13a)$$

In the frequency domain, the relationship between the reconstruction and the original analog 
signal can rely on Eq. (6.4): 

$$\tilde{G}(f) = P(f)\,\frac{1}{T_s} \sum_n G(f - nf_s) \qquad (6.13b)$$

This means that the reconstructed signal $\tilde{g}(t)$ using pulse $p(t)$ consists of multiple replicas of 
$G(f)$ shifted to the frequency center $nf_s$ and filtered by $P(f)$. To fully recover $g(t)$, further 
filtering of $\tilde{g}(t)$ becomes necessary. Such filters are often referred to as equalizers. 

Denote the equalizer transfer function as $E(f)$. Distortionless reconstruction requires that 

$$G(f) = E(f)\,\tilde{G}(f) = E(f)P(f)\,\frac{1}{T_s} \sum_n G(f - nf_s)$$

This relationship clearly illustrates that the equalizer must remove all the shifted replicas 
$G(f - nf_s)$ in the summation except for the low-pass term with $n = 0$, that is, 

$$E(f)P(f) = 0 \qquad |f| > f_s - B \qquad (6.14a)$$

Figure 6.5 Practical signal reconstruction. 

Figure 6.6 Simple interpolation by means of simple rectangular pulses. 



Additionally, distortionless reconstruction requires that 

$$E(f)P(f) = T_s \qquad |f| < B \qquad (6.14b)$$

The equalizer filter $E(f)$ must be low-pass in nature to stop all frequency content above 
$f_s - B$ Hz, and it should be the inverse of $P(f)$ within the signal bandwidth of $B$ Hz. Figure 6.5 
demonstrates the diagram of a practical signal reconstruction system utilizing such an equalizer. 

Let us now consider a very simple interpolating pulse generator that generates short 
(zero-order hold) pulses. As shown in Fig. 6.6, 

$$p(t) = \Pi\!\left(\frac{t - 0.5T_p}{T_p}\right)$$

This is a gate pulse of unit height with pulse duration $T_p$. The reconstruction will first generate 

$$\tilde{g}(t) = \sum_n g(nT_s)\,\Pi\!\left(\frac{t - nT_s - 0.5T_p}{T_p}\right)$$

The transfer function of filter $P(f)$ is the Fourier transform of $\Pi(t/T_p)$ shifted by $0.5T_p$: 

$$P(f) = T_p\,\mathrm{sinc}\,(\pi f T_p)\,e^{-j\pi f T_p} \qquad (6.15)$$

As a result, the equalizer frequency response should satisfy 

$$E(f) = \begin{cases} T_s/P(f) & |f| < B \\ \text{Flexible} & B < |f| < (1/T_s - B) \\ 0 & |f| > (1/T_s - B) \end{cases}$$

It is important for us to ascertain that the equalizer passband response is realizable. First 
of all, we can add another time delay to the reconstruction such that 

$$E(f) = T_s \cdot \frac{\pi f}{\sin(\pi f T_p)} \qquad |f| < B \qquad (6.16)$$

For the passband gain of $E(f)$ to be well defined, it is imperative for us to choose a short 
pulse width $T_p$ such that 

$$\frac{\sin(\pi f T_p)}{\pi f} \ne 0 \qquad |f| < B$$

This means that the equalizer $E(f)$ does not need to achieve infinite gain. Otherwise the 
equalizer would become unrealizable. Equivalently, this requires that 

$$T_p < 1/B$$

Hence, as long as the rectangular reconstruction pulse width is shorter than $1/B$, it may be 
possible to design an analog equalizer filter to recover the original analog signal $g(t)$ from 
the nonideal reconstruction pulse train. Of course, this is a requirement for a rectangular 
reconstruction pulse generator. In practice, $T_p$ can be chosen very small, to yield the following 
equalizer passband response: 

$$E(f) = T_s \cdot \frac{\pi f}{\sin(\pi f T_p)} \approx \frac{T_s}{T_p} \qquad |f| < B \qquad (6.17)$$

This means that very little distortion remains when very short rectangular pulses are used in 
signal reconstruction. Such cases make the design of the equalizer either unnecessary or very 
simple. An illustrative example is given as a MATLAB exercise in Sec. 6.9. 
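
To get a feel for how the pulse width affects the equalizer, the sketch below evaluates the passband gain $E(f) = T_s\,\pi f/\sin(\pi f T_p)$ for a full-width hold ($T_p = T_s$) and for a very short pulse ($T_p = 0.01T_s$). The values of $B$, $T_s$, and the two pulse widths are assumptions chosen for illustration; this is separate from the exercise in Sec. 6.9.

```matlab
% Passband variation of the zero-order-hold equalizer for two pulse widths.
B  = 1;                           % signal bandwidth in Hz (assumed)
Ts = 1/(4*B);                     % sampling interval, fs = 4B (assumed)
f  = linspace(0.01*B, B, 200);    % passband frequencies (f = 0 is skipped)

Tp_list = [Ts, 0.01*Ts];          % full-width hold vs. very short pulse
droop   = zeros(size(Tp_list));
for i = 1:numel(Tp_list)
    Tp = Tp_list(i);
    E  = Ts*pi*f ./ sin(pi*f*Tp); % equalizer passband gain
    droop(i) = max(E)/min(E);     % gain variation across the passband
end

fprintf('gain variation: Tp = Ts -> %.4f, Tp = 0.01*Ts -> %.6f\n', droop);
```

With $T_p = T_s$ the equalizer has to correct roughly an 11% gain variation across the band, whereas for the short pulse the required gain is essentially the constant $T_s/T_p$.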

We can improve on the zero-order-hold filter by using the first-order-hold filter, which 
results in a linear interpolation instead of the staircase interpolation. The linear interpolator, 
whose impulse response is a triangle pulse $\Delta(t/2T_s)$, results in an interpolation in which 
successive sample tops are connected by straight-line segments (Prob. 6.1-7). 


6.1.2 Practical Issues in Signal Sampling and Reconstruction 


Realizability of Reconstruction Filters 

If a signal is sampled at the Nyquist rate $f_s = 2B$ Hz, the spectrum $\bar{G}(f)$ consists of repetitions 
of $G(f)$ without any gap between successive cycles, as shown in Fig. 6.7a. To recover $g(t)$ 
from $\bar{g}(t)$, we need to pass the sampled signal $\bar{g}(t)$ through an ideal low-pass filter (dotted 
area in Fig. 6.7a). As seen in Sec. 3.5, such a filter is unrealizable in practice; it can be closely 
approximated only with infinite time delay in the response. This means that we can recover 
the signal $g(t)$ from its samples with infinite time delay. 

A practical solution to this problem is to sample the signal at a rate higher than the Nyquist 
rate ($f_s > 2B$ or $\omega_s > 4\pi B$). This yields $\bar{G}(f)$, consisting of repetitions of $G(f)$ with a finite 
band gap between successive cycles, as shown in Fig. 6.7b. We can now recover $G(f)$ from 
$\bar{G}(f)$ by using a low-pass filter with a gradual cutoff characteristic (dotted area 
in Fig. 6.7b). But even in this case, the filter gain is required to be zero beyond the first cycle 
of $G(f)$ (Fig. 6.7b). According to the Paley-Wiener criterion, it is impossible to realize even 
this filter. The only advantage in this case is that the required filter can be better approximated 
with a smaller time delay. This shows that it is impossible in practice to recover a band-limited 
signal $g(t)$ exactly from its samples, even if the sampling rate is higher than the Nyquist rate. 
However, as the sampling rate increases, the recovered signal approaches the desired signal 
more closely. 

The Treachery of Aliasing 

There is another fundamental practical difficulty in reconstructing a signal from its samples.
The sampling theorem was proved on the assumption that the signal g(t) is band-limited.
All practical signals are time-limited; that is, they are of finite duration or width. We can
demonstrate (Prob. 6.1-8) that a signal cannot be time-limited and band-limited simultaneously.
A time-limited signal cannot be band-limited, and vice versa (but a signal can be simultaneously
non-time-limited and non-band-limited). Clearly, all practical signals, which are necessarily
time-limited, are non-band-limited, as shown in Fig. 6.8a; they have infinite bandwidth, and
the spectrum $\overline{G}(f)$ consists of overlapping cycles of G(f) repeating every f_s Hz (the sampling
frequency), as illustrated in Fig. 6.8b. Because of the infinite bandwidth in this case, the spectral
overlap is unavoidable, regardless of the sampling rate. Sampling at a higher rate reduces but
does not eliminate overlapping between repeating spectral cycles. Because of the overlapping
tails, $\overline{G}(f)$ no longer has complete information about G(f), and it is no longer possible, even
theoretically, to recover g(t) exactly from the sampled signal $\overline{g}(t)$. If the sampled signal is
passed through an ideal low-pass filter of cutoff frequency f_s/2 Hz, the output is not G(f) but
G_a(f) (Fig. 6.8c), which is a version of G(f) distorted as a result of two separate causes:

1. The loss of the tail of G(f) beyond |f| > f_s/2 Hz.

2. The reappearance of this tail inverted or folded back onto the spectrum.

Note that the spectra cross at frequency f_s/2 = 1/(2T) Hz, which is called the folding
frequency. The spectrum may be viewed as if the lost tail is folding back onto itself at the folding
frequency. For instance, a component of frequency (f_s/2) + f_z shows up as, or "impersonates,"
a component of lower frequency (f_s/2) - f_z in the reconstructed signal. Thus, the components of
frequencies above f_s/2 reappear as components of frequencies below f_s/2. This tail inversion,
known as spectral folding or aliasing, is shown shaded in Fig. 6.8b and also in Fig. 6.8c. In
the process of aliasing, not only are we losing all the components of frequencies above the
folding frequency f_s/2 Hz, but these very components reappear (aliased) as lower frequency
components in Fig. 6.8b or c. Such aliasing destroys the integrity of the frequency components
below the folding frequency f_s/2, as depicted in Fig. 6.8c.

Figure 6.8 Aliasing effect. (a) Spectrum of a practical signal. (b) Spectrum of sampled g(t).
(c) Reconstructed signal spectrum. (d) Sampling scheme using antialiasing filter. (e) Sampled
signal spectrum (dotted) and the reconstructed signal spectrum (solid) when antialiasing filter
is used.
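
The folding relation can be checked numerically. The following is a minimal sketch, not part of the original development: the function names aliased_frequency and sample_cosine and the values f_s = 1000 Hz and f_z = 100 Hz are assumptions chosen only for illustration. It shows that a cosine at (f_s/2) + f_z produces exactly the same samples as a cosine at (f_s/2) - f_z.

```python
import math

def aliased_frequency(f, fs):
    """Apparent frequency (Hz) of a real sinusoid of frequency f when sampled at fs."""
    f_mod = f % fs                  # position within one period of the repeated spectrum
    return min(f_mod, fs - f_mod)   # fold about the folding frequency fs/2

def sample_cosine(f, fs, n_samples):
    """Samples cos(2*pi*f*n/fs) for n = 0 .. n_samples-1."""
    return [math.cos(2 * math.pi * f * n / fs) for n in range(n_samples)]

fs = 1000.0   # sampling rate in Hz (illustrative); folding frequency is 500 Hz
fz = 100.0    # offset from the folding frequency (illustrative)

high = sample_cosine(fs / 2 + fz, fs, 50)   # 600 Hz tone, above the folding frequency
low = sample_cosine(fs / 2 - fz, fs, 50)    # 400 Hz tone, below the folding frequency
max_diff = max(abs(a - b) for a, b in zip(high, low))

print(aliased_frequency(fs / 2 + fz, fs))   # frequency the 600 Hz tone impersonates
print(max_diff < 1e-9)                      # the two sample sequences coincide
```

Once the samples are taken, nothing in them distinguishes the two tones, which is why the folded tail cannot be removed after sampling.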

The problem of aliasing is analogous to that of an army when a certain platoon has secretly
defected to the enemy side but remains nominally loyal to their army. The army is in double
jeopardy. First, it has lost the defecting platoon as an effective fighting force. In addition, during
actual fighting, the army will have to contend with sabotage caused by the defectors and will
have to use a loyal platoon to neutralize the defectors. Thus, the army has lost two platoons to
nonproductive activity.

Defectors Eliminated: The Antialiasing Filter 

If you were the commander of the betrayed army, the solution to the problem would be obvious. 
As soon as you got wind of the defection, you would incapacitate, by whatever means, the 
defecting platoon. By taking this action before the fighting begins, you lose only one (the 
defecting)* platoon. This is a partial solution to the double jeopardy of betrayal and sabotage, 
a solution that partly rectifies the problem and cuts the losses in half. 

We follow exactly the same procedure. The potential defectors are all the frequency
components beyond the folding frequency f_s/2 = 1/(2T) Hz. We should eliminate (suppress) these
components from g(t) before sampling g(t). Such suppression of higher frequencies can be
accomplished by an ideal low-pass filter of cutoff f_s/2 Hz, as shown in Fig. 6.8d. This is called
the antialiasing filter. Figure 6.8d also shows that antialiasing filtering is performed before
sampling. Figure 6.8e shows the sampled signal spectrum and the reconstructed signal G_aa(f)
when the antialiasing scheme is used. An antialiasing filter essentially band-limits the signal
g(t) to f_s/2 Hz. This way, we lose only the components beyond the folding frequency f_s/2 Hz.
These suppressed components now cannot reappear, corrupting the components of frequencies
below the folding frequency. Clearly, use of an antialiasing filter results in the reconstructed
signal spectrum G_aa(f) = G(f) for |f| < f_s/2. Thus, although we lost the spectrum beyond
f_s/2 Hz, the spectrum for all the frequencies below f_s/2 remains intact. The effective aliasing
distortion is cut in half owing to elimination of folding. We stress again that the antialiasing
operation must be performed before the signal is sampled.
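
The benefit of filtering before sampling can be illustrated with a two-tone test signal. This is a sketch under stated assumptions rather than a practical filter design: the signal is a known sum of cosines, so the ideal antialiasing filter is modeled simply by discarding every tone at or above f_s/2, and the helper names (sample_tones, ideal_antialias, tone_amplitude) and numerical values are hypothetical.

```python
import math

def sample_tones(tones, fs, n):
    """Sample a sum of cosines, given as (amplitude, frequency) pairs, at rate fs."""
    return [sum(a * math.cos(2 * math.pi * f * k / fs) for a, f in tones)
            for k in range(n)]

def ideal_antialias(tones, fs):
    """Model of an ideal low-pass filter with cutoff fs/2: drop tones at or above fs/2."""
    return [(a, f) for a, f in tones if f < fs / 2]

def tone_amplitude(samples, f, fs):
    """Amplitude of the component at frequency f, from a single DFT-style correlation."""
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * f * k / fs) for k, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * f * k / fs) for k, x in enumerate(samples))
    return 2 * math.hypot(re, im) / n

fs, n = 1000.0, 1000                     # folding frequency is 500 Hz
tones = [(1.0, 100.0), (0.5, 700.0)]     # the 700 Hz tone lies beyond the folding frequency

raw = sample_tones(tones, fs, n)                             # sampled without prefiltering
filtered = sample_tones(ideal_antialias(tones, fs), fs, n)   # filtered, then sampled

print(round(tone_amplitude(raw, 300.0, fs), 3))        # spurious 300 Hz component (folded 700 Hz tone)
print(round(tone_amplitude(filtered, 300.0, fs), 3))   # no spurious component after prefiltering
print(round(tone_amplitude(filtered, 100.0, fs), 3))   # in-band tone is preserved
```

In both cases the 700 Hz tone is lost, but only without the prefilter does it return as a false 300 Hz component inside the signal band.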

An antialiasing filter also helps to reduce noise. Noise, generally, has a wideband spectrum,
and without antialiasing, the aliasing phenomenon itself will cause the noise components
outside the desired signal band to appear in the signal band. Antialiasing suppresses the entire
noise spectrum beyond frequency f_s/2.

The antialiasing filter, being an ideal filter, is unrealizable. In practice we use a steep-cutoff
filter, which leaves a sharply attenuated residual spectrum beyond the folding frequency f_s/2.

Sampling Forces Non-Band-Limited Signals to Appear Band-Limited 

Figure 6.8b shows that the spectrum of the sampled signal $\overline{g}(t)$ consists of overlapping cycles
of G(f). This means that $\overline{g}(t)$ are sub-Nyquist samples of g(t). However, we may also view the
spectrum in Fig. 6.8b as the spectrum G_a(f) (Fig. 6.8c), repeating periodically every f_s Hz without
overlap. The spectrum G_a(f) is band-limited to f_s/2 Hz. Hence, these (sub-Nyquist) samples of g(t)
are actually the Nyquist samples for signal g_a(t). In conclusion, sampling a non-band-limited
signal g(t) at a rate f_s Hz makes the samples appear to be the Nyquist samples of some signal
g_a(t), band-limited to f_s/2 Hz. In other words, sampling makes a non-band-limited signal
appear to be a band-limited signal g_a(t) with bandwidth f_s/2 Hz. A similar conclusion applies
if g(t) is band-limited but sampled at a sub-Nyquist rate.

* Figure 6.8b shows that from the infinite number of repeating cycles, only the neighboring spectral cycles overlap.
This is a somewhat simplified picture. In reality, all the cycles overlap and interact with every other cycle because
of the infinite width of all practical signal spectra. Fortunately, all practical spectra also must decay at higher
frequencies. This results in an insignificant amount of interference from cycles other than the immediate neighbors.
When such an assumption is not justified, aliasing computations become a little more involved.

6.1.3 Maximum Information Rate: Two Pieces of 
Information per Second per Hertz 

A knowledge of the maximum rate at which information can be transmitted over a channel of
bandwidth B Hz is of fundamental importance in digital communication. We now derive one
of the basic relationships in communication, which states that a maximum of 2B independent
pieces of information per second can be transmitted, error free, over a noiseless channel of
bandwidth B Hz. The result follows from the sampling theorem.

First, the sampling theorem shows that a low-pass signal of bandwidth B Hz can be fully
recovered from samples uniformly taken at the rate of 2B samples per second. Conversely,
we need to show that any sequence of independent data at the rate of 2B Hz can come from
uniform samples of a low-pass signal with bandwidth B. Moreover, we can construct this
low-pass signal from the independent data sequence.

Suppose a sequence of independent data samples is denoted as {g_n}. Its rate is 2B samples
per second. Then there always exists a (not necessarily band-limited) signal g(t) such that

$$ g_n = g(nT_s), \qquad T_s = \frac{1}{2B} $$

In Figure 6.9a we illustrate again the effect of sampling the non-band-limited signal g(t) at
sampling rate f_s = 2B Hz. Because of aliasing, the ideal sampled signal is

$$ \overline{g}(t) = \sum_n g(nT_s)\,\delta(t - nT_s) = \sum_n g_a(nT_s)\,\delta(t - nT_s) $$

where g_a(t) is the aliased low-pass signal whose samples g_a(nT_s) equal the samples
g(nT_s). In other words, sub-Nyquist sampling of a signal g(t) generates samples that can
be equally well obtained by Nyquist sampling of a band-limited signal g_a(t). Thus, through
Figure 6.9, we demonstrate that sampling g(t) and g_a(t) at the rate of 2B Hz will generate the
same independent information sequence {g_n}:

$$ g_n = g(nT_s) = g_a(nT_s), \qquad T_s = \frac{1}{2B} \qquad (6.18) $$

Also from the sampling theorem, a low-pass signal g_a(t) with bandwidth B can be reconstructed
from its uniform samples [Eq. (6.10)]

$$ g_a(t) = \sum_n g_n \,\mathrm{sinc}\,(2\pi Bt - n\pi) $$

Assuming no noise, this signal can be transmitted over a distortionless channel of bandwidth
B Hz, error free. At the receiver, the data sequence {g_n} can be recovered from the Nyquist
samples of the distortionless channel output g_a(t) as the desired information data.

This theoretical rate of communication assumes a noise-free channel. In practice, channel
noise is unavoidable, and consequently, this rate will cause some detection errors. In
Chapter 14, we shall present the Shannon capacity which determines the theoretical error-free
communication rate in the presence of noise.
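
The construction used in this argument can be tried numerically. The sketch below is illustrative only: the bandwidth B = 4000 Hz and the data values are arbitrary assumptions, the sum is truncated to a finite sequence, and sinc follows the convention sinc(x) = sin(x)/x used in this chapter. It builds g_a(t) from an arbitrary data sequence and confirms that sampling g_a(t) every T_s = 1/(2B) seconds returns the data.

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x, the unnormalized convention used in the text."""
    return 1.0 if x == 0 else math.sin(x) / x

def interpolate(data, B, t):
    """g_a(t) = sum over n of g_n * sinc(2*pi*B*t - n*pi) for a finite data sequence."""
    return sum(g * sinc(2 * math.pi * B * t - n * math.pi)
               for n, g in enumerate(data))

B = 4000.0                                  # channel bandwidth in Hz (illustrative)
T_s = 1 / (2 * B)                           # Nyquist interval: 2B values per second
data = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]     # arbitrary independent data values

# Sampling the band-limited signal at t = k*T_s gives back the k-th data value,
# because every other sinc pulse passes through zero at that instant.
recovered = [interpolate(data, B, k * T_s) for k in range(len(data))]
print([round(v, 6) for v in recovered])
```

Each of the 2B values sent per second is recovered independently of the others, which is the sense in which the channel carries two pieces of information per second per hertz.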


6.1.4 Nonideal Practical Sampling Analysis 

Thus far, we have mainly focused on ideal uniform sampling that can use an ideal impulse
sampling pulse train to precisely extract the signal value g(kT_s) at the precise instant of t =
kT_s. In practice, no physical device can carry out such a task. Consequently, we need to
consider the more practical implementation of sampling. This analysis is important to the
better understanding of errors that typically occur during practical A/D conversion and their
effects on signal reconstruction.

Practical samplers take each signal sample over a short time interval T_p around t = kT_s.
In other words, every T_s seconds, the sampling device takes a short snapshot of duration T_p
from the signal g(t) being sampled. This is just like taking a sequence of still photographs
of a sprinter during a 100-meter Olympic race. Much like a regular camera that generates a
still picture by averaging the picture scene over the window T_p, the practical sampler would
generate a sample value at t = kT_s by averaging the values of signal g(t) over the window T_p,
that is,


$$ g_1(kT_s) = \frac{1}{T_p}\int_{-T_p/2}^{T_p/2} g(kT_s + t)\,dt \qquad (6.19a) $$

Depending on the actual device, this averaging may be weighted by a device-dependent
averaging function q(t) such that

$$ g_1(kT_s) = \frac{1}{T_p}\int_{-T_p/2}^{T_p/2} q(t)\,g(kT_s + t)\,dt \qquad (6.19b) $$
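
The effect of the averaging window can be seen on a single tone. The sketch below is an illustration under assumed values (a 1 kHz cosine, T_s = 1/8000 s, T_p = T_s/4) with hypothetical helper names. It evaluates the uniform average of Eq. (6.19a) by the midpoint rule and compares it with the ideal sample scaled by sin(π f_0 T_p)/(π f_0 T_p), which is what direct integration of a cosine over a symmetric window gives.

```python
import math

def averaged_sample(g, t_k, T_p, steps=1000):
    """Midpoint-rule estimate of (1/T_p) * integral of g(t_k + t) over |t| <= T_p/2."""
    dt = T_p / steps
    total = sum(g(t_k - T_p / 2 + (i + 0.5) * dt) for i in range(steps))
    return total * dt / T_p

F0 = 1000.0   # test-tone frequency in Hz (illustrative)

def tone(t):
    """Test signal g(t) = cos(2*pi*F0*t)."""
    return math.cos(2 * math.pi * F0 * t)

T_s = 1 / 8000.0    # sampling interval (illustrative)
T_p = T_s / 4       # width of each snapshot (illustrative)
t_k = 3 * T_s       # one sampling instant

ideal = tone(t_k)                           # what an impulse sampler would read
practical = averaged_sample(tone, t_k, T_p) # what the averaging sampler reads
x = math.pi * F0 * T_p
predicted = ideal * math.sin(x) / x         # ideal sample scaled by sinc(pi*F0*T_p)

print(ideal, practical, predicted)
```

A narrower window T_p brings the scale factor closer to 1, so the practical sample approaches the ideal one.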
Figure 6.10 Illustration of practical sampling.




Thus we have used the camera analogy to establish that practical samplers in fact generate
a sampled signal of the form

$$ \overline{g}(t) = \sum_k g_1(kT_s)\,\delta(t - kT_s) \qquad (6.20) $$


We will now show the relationship between the practically sampled signal $\overline{g}(t)$ and the original
low-pass analog signal g(t) in the frequency domain.

We will use Fig. 6.10 to illustrate the relationship between $\overline{g}(t)$ and g(t) for the special
case of uniform weighting. This means that

$$ q(t) = \begin{cases} 1 & |t| \le 0.5T_p \\ 0 & |t| > 0.5T_p \end{cases} $$

As shown in Fig. 6.10, g_1(t) can be equivalently obtained by first using "natural gating" to
generate the signal snapshots

$$ \hat{g}(t) = g(t) \cdot q_{T_s}(t) \qquad (6.21) $$

where

$$ q_{T_s}(t) = \sum_{n=-\infty}^{\infty} q(t - nT_s) $$

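Figure 6.10b illustrates the snapshot signal $\hat{g}(t)$. We can then define an averaging filter with
impulse response

$$ h_a(t) = \begin{cases} \dfrac{1}{T_p} & -\dfrac{T_p}{2} \le t < \dfrac{T_p}{2} \\ 0 & \text{elsewhere} \end{cases} $$

or transfer function

$$ H_a(f) = \mathrm{sinc}\,(\pi f T_p) $$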

Sending the naturally gated snapshot signal $\hat{g}(t)$ into the averaging filter generates the
output signal

$$ g_1(t) = h_a(t) * \hat{g}(t) $$

As illustrated in Fig. 6.10c, the practical sampler generates a sampled signal $\overline{g}(t)$ by sampling
the averaging filter output g_1(kT_s). Thus we have used Fig. 6.10c to establish the equivalent
process of taking snapshots, averaging, and sampling in generating practical samples of g(t).
Now we can examine the frequency domain relationships to analyze the distortion generated
by practical samplers.

In the following analysis, we will consider a general weighting function q(t) whose only
constraint is that

$$ q(t) = 0, \qquad t \notin (-0.5T_p,\ 0.5T_p) $$

To begin, note that q_{T_s}(t) is periodic. Therefore, its Fourier series can be written as

$$ q_{T_s}(t) = \sum_{n=-\infty}^{\infty} Q_n e^{jn\omega_s t} $$

where

$$ Q_n = \frac{1}{T_s}\int_{-0.5T_p}^{0.5T_p} q(t)\,e^{-jn\omega_s t}\,dt $$

Thus, the averaging filter output signal is

$$ g_1(t) = h_a(t) * \left[ g(t)\,q_{T_s}(t) \right] = h_a(t) * \sum_{n=-\infty}^{\infty} Q_n\,g(t)\,e^{jn\omega_s t} \qquad (6.22) $$
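In the frequency domain, we have

$$
\begin{aligned}
G_1(f) &= H_a(f) \sum_{n=-\infty}^{\infty} Q_n\,G(f - nf_s) \\
       &= \mathrm{sinc}\,(\pi f T_p) \sum_{n=-\infty}^{\infty} Q_n\,G(f - nf_s) \qquad (6.23)
\end{aligned}
$$

Because

$$ \overline{g}(t) = \sum_k g_1(kT_s)\,\delta(t - kT_s) $$

we can apply the sampling theorem to show that

$$
\begin{aligned}
\overline{G}(f) &= \frac{1}{T_s}\sum_m G_1(f + mf_s) \\
 &= \frac{1}{T_s}\sum_m \mathrm{sinc}\left[\frac{(2\pi f + m\,2\pi f_s)T_p}{2}\right] \sum_n Q_n\,G(f + mf_s - nf_s) \\
 &= \sum_\ell \left\{ \frac{1}{T_s}\sum_n Q_n\,\mathrm{sinc}\left[(\pi f + (n+\ell)\pi f_s)T_p\right] \right\} G(f + \ell f_s) \qquad (6.24)
\end{aligned}
$$

The last equality came from the change of the summation index ℓ = m - n.
We can define frequency responses

$$ F_\ell(f) = \frac{1}{T_s}\sum_n Q_n\,\mathrm{sinc}\left[(\pi f + (n+\ell)\pi f_s)T_p\right] $$

This definition allows us to conveniently write

$$ \overline{G}(f) = \sum_\ell F_\ell(f)\,G(f + \ell f_s) \qquad (6.25) $$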


For the low-pass signal G(f) with bandwidth B Hz, applying an ideal low-pass (interpolation)
filter will generate a distorted signal

$$ F_0(f)\,G(f) \qquad (6.26a) $$

in which

$$ F_0(f) = \frac{1}{T_s}\sum_n Q_n\,\mathrm{sinc}\left[\pi(f + nf_s)T_p\right] \qquad (6.26b) $$


It can be seen from Eqs. (6.25) and (6.26) that the practically sampled signal already contains
a known distortion F_0(f).

Moreover, the use of a practical reconstruction pulse p(t) as in Eq. (6.12) will generate
additional distortion. Let us reconstruct g(t) by using the practical samples to generate

$$ \tilde{g}(t) = \sum_n g_1(nT_s)\,p(t - nT_s) $$

Then from Eq. (6.13) we obtain the relationship between the spectra of the reconstruction and
the original message G(f) as

$$ \tilde{G}(f) = P(f)\sum_n F_n(f)\,G(f + nf_s) \qquad (6.27) $$

Since G(f) has bandwidth B Hz, we will need to design a new equalizer with transfer function
E(f) such that the reconstruction is distortionless within the bandwidth B, that is,

$$ E(f)\,P(f)\,F_0(f) = \begin{cases} 1 & |f| < B \\ \text{Flexible} & B < |f| < f_s - B \\ 0 & |f| > f_s - B \end{cases} \qquad (6.28) $$

This single equalizer can be designed to compensate for two sources of distortion: the nonideal
sampling effect in F_0(f) and the nonideal reconstruction effect in P(f). The equalizer design is
made practically possible because both distortions are known in advance.


6.1.5 Some Applications of the Sampling Theorem 


The sampling theorem is very important in signal analysis, processing, and transmission
because it allows us to replace a continuous time signal by a discrete sequence of numbers.
Processing a continuous time signal is therefore equivalent to processing a discrete sequence of
numbers. This leads us directly into the area of digital filtering. In the field of communication,
the transmission of a continuous time message reduces to the transmission of a sequence of
numbers. This opens doors to many new techniques of communicating continuous time signals
by pulse trains. The continuous time signal g(t) is sampled, and sample values are used to
modify certain parameters of a periodic pulse train. We may vary the amplitudes (Fig. 6.11b),
widths (Fig. 6.11c), or positions (Fig. 6.11d) of the pulses in proportion to the sample values of
the signal g(t). Accordingly, we can have pulse amplitude modulation (PAM), pulse width
modulation (PWM), or pulse position modulation (PPM). The most important form of pulse
modulation today is pulse code modulation (PCM), introduced in Sec. 1.2. In all these cases,
instead of transmitting g(t), we transmit the corresponding pulse-modulated signal. At the
receiver, we read the information of the pulse-modulated signal and reconstruct the analog
signal g(t).

One advantage of using pulse modulation is that it permits the simultaneous transmission
of several signals on a time-sharing basis, known as time division multiplexing (TDM). Because a
pulse-modulated signal occupies only a part of the channel time, we can transmit several
pulse-modulated signals on the same channel by interweaving them. Figure 6.12 shows the TDM
of two PAM signals. In this manner we can multiplex several signals on the same channel by
reducing pulse widths.
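
The interweaving can be sketched in a few lines. This is only an illustration of the sample ordering on the line: the function names and sample values are hypothetical, and pulse shapes, timing, and synchronization are ignored.

```python
def tdm_multiplex(streams):
    """Interleave equal-length sample streams: one sample from each stream per frame."""
    if len({len(s) for s in streams}) > 1:
        raise ValueError("all streams must have the same number of samples")
    return [sample for frame in zip(*streams) for sample in frame]

def tdm_demultiplex(line, n_streams):
    """Recover each stream by taking every n_streams-th sample from the line."""
    return [line[i::n_streams] for i in range(n_streams)]

pam_1 = [0.1, 0.4, 0.7, 0.2]      # sample values of the first PAM signal (illustrative)
pam_2 = [-0.3, -0.6, 0.0, 0.5]    # sample values of the second PAM signal (illustrative)

line = tdm_multiplex([pam_1, pam_2])
print(line)                       # [0.1, -0.3, 0.4, -0.6, 0.7, 0.0, 0.2, 0.5]
print(tdm_demultiplex(line, 2))   # the two original streams
```

With N signals sharing the channel, each pulse can occupy at most 1/N of the sampling interval, which is why multiplexing more signals requires narrower pulses.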

Another method of transmitting several baseband signals simultaneously is frequency
division multiplexing (FDM), briefly discussed in Chapter 4. In FDM, various signals are
multiplexed by sharing the channel bandwidth. The spectrum of each message is shifted to a specific
band not occupied by any other signal. The information of various signals is located in
nonoverlapping frequency bands of the channel. In a way, TDM and FDM are duals of each other.
Figure 6.11 Pulse-modulated signals. (a) The unmodulated signal. (b) The PAM signal.
(c) The PWM (PDM) signal. (d) The PPM signal.

Figure 6.12 Time division multiplexing of two signals.

Figure 6.13 PCM system diagram.


6.2 PULSE CODE MODULATION (PCM) 

PCM is the most useful and widely used of all the pulse modulations mentioned. As shown in
Fig. 6.13, PCM basically is a tool for converting an analog signal into a digital signal (A/D
conversion). An analog signal is characterized by an amplitude that can take on any value over
a continuous range. This means that it can take on an infinite number of values. On the other
hand, digital signal amplitude can take on only a finite number of values. An analog signal can
be converted into a digital signal by means of sampling and quantizing, that is, rounding off
its value to one of the closest permissible numbers (or quantized levels), as shown in Fig. 6.14.
The amplitudes of the analog signal m(t) lie in the range (-m_p, m_p), which is partitioned into L
subintervals, each of magnitude Δv = 2m_p/L. Next, each sample amplitude is approximated
by the midpoint value of the subinterval in which the sample falls (see Fig. 6.14 for L = 16).
Each sample is now approximated to one of the L numbers. Thus, the signal is digitized, with
quantized samples taking on any one of the L values. Such a signal is known as an L-ary
digital signal.

Figure 6.14 Quantization of a sampled analog signal.

From a practical viewpoint, a binary digital signal (a signal that can take on only two values)
is very desirable because of its simplicity, economy, and ease of engineering. We can convert
an L-ary signal into a binary signal by using pulse coding. Such a coding for the case of L = 16
was shown in Fig. 1.5. This code, formed by binary representation of the 16 decimal numbers
from 0 to 15, is known as the natural binary code (NBC). Other possible ways of assigning
a binary code will be discussed later. Each of the 16 levels to be transmitted is assigned one
binary code of four digits. The analog signal m(t) is now converted to a (binary) digital signal.
A binary digit is called a bit for convenience. This contraction of "binary digit" to "bit" has
become an industry standard abbreviation and is used throughout the book.

Thus, each sample in this example is encoded by four bits. To transmit this binary data, 
we need to assign a distinct pulse shape to each of the two bits. One possible way is to assign a 
negative pulse to a binary 0 and a positive pulse to a binary 1 (Fig. 1.5) so that each sample is 
now transmitted by a group of four binary pulses (pulse code). The resulting signal is a binary 
signal. 
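
The chain from sample to pulse code for L = 16 can be sketched as follows. This is a minimal illustration with hypothetical function names; numbering the levels from the lowest subinterval upward is an assumption made here for concreteness and is not taken from Fig. 1.5.

```python
def quantize(sample, m_p, L):
    """Uniform midpoint quantizer on (-m_p, m_p): returns (level index, quantized value)."""
    step = 2 * m_p / L                                  # subinterval width
    clipped = min(max(sample, -m_p), m_p)               # amplitudes beyond the range are chopped off
    index = min(int((clipped + m_p) / step), L - 1)     # the top edge belongs to the last subinterval
    return index, -m_p + (index + 0.5) * step           # midpoint of the subinterval

def natural_binary_code(index, L):
    """Natural binary code of a level index, padded to the number of bits needed for L levels."""
    n_bits = (L - 1).bit_length()
    return format(index, "0{}b".format(n_bits))

def to_pulses(code):
    """Binary 1 -> positive pulse (+1), binary 0 -> negative pulse (-1)."""
    return [1 if bit == "1" else -1 for bit in code]

index, level = quantize(0.3, 1.0, 16)       # a sample of 0.3 with m_p = 1 (illustrative)
code = natural_binary_code(index, 16)
print(index, level, code, to_pulses(code))  # 10 0.3125 1010 [1, -1, 1, -1]
```

Each sample therefore leaves the encoder as a group of four binary pulses, and the receiver needs only to decide the polarity of each pulse.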

The audio signal bandwidth is about 15 kHz. However, for speech, subjective tests show
that signal articulation (intelligibility) is not affected if all the components above 3400 Hz
are suppressed.* Since the objective in telephone communication is intelligibility rather than
high fidelity, the components above 3400 Hz are eliminated by a low-pass filter. The resulting
signal is then sampled at a rate of 8000 samples per second (8 kHz). This rate is intentionally
kept higher than the Nyquist sampling rate of 6.8 kHz so that realizable filters can be applied
for signal reconstruction. Each sample is finally quantized into 256 levels (L = 256), which
requires a group of eight binary pulses to encode each sample (2^8 = 256). Thus, a telephone
signal requires 8 × 8000 = 64,000 binary pulses per second.


* Components below 300 Hz may also be suppressed without affecting the articulation. 
The compact disc (CD) is a more recent application of PCM. This is a high-fidelity situation
requiring the audio signal bandwidth to be 20 kHz. Although the Nyquist sampling rate is only
40 kHz, the actual sampling rate of 44.1 kHz is used for the reason mentioned earlier. The
signal is quantized into a rather large number (L = 65,536) of quantization levels, each of
which is represented by 16 bits to reduce the quantizing error. The binary-coded samples (1.4
million bit/s) are then recorded on the compact disc.
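
The telephone and compact disc figures follow from the same arithmetic: bits per sample times samples per second. The sketch below uses a hypothetical helper, pcm_bit_rate. The channels parameter is an assumption added here: the quoted 1.4 million bit/s corresponds to two audio channels (stereo), which the passage does not state explicitly.

```python
def pcm_bit_rate(sampling_rate_hz, levels, channels=1):
    """Bits per second for PCM: (bits needed for `levels`) * sampling rate * channels."""
    bits_per_sample = (levels - 1).bit_length()   # 8 for L = 256, 16 for L = 65,536
    return sampling_rate_hz * bits_per_sample * channels

telephone = pcm_bit_rate(8000, 256)                     # speech, one channel
compact_disc = pcm_bit_rate(44100, 65536, channels=2)   # audio, assuming two channels

print(telephone)      # 64000
print(compact_disc)   # 1411200
```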


6.2.1 Advantages of Digital Communication 

Here are some of the advantages of digital communication over analog communication. 

1. Digital communication, which can withstand channel noise and distortion much better
than analog as long as the noise and the distortion are within limits, is more rugged than analog
communication. With analog messages, on the other hand, any distortion or noise, no matter
how small, will distort the received signal.

2. The greatest advantage of digital communication over analog communication, however,
is the viability of regenerative repeaters in the former. In an analog communication system,
a message signal becomes progressively weaker as it travels along the channel, whereas the
cumulative channel noise and the signal distortion grow progressively stronger. Ultimately
the signal is overwhelmed by noise and distortion. Amplification offers little help because it
enhances the signal and the noise by the same proportion. Consequently, the distance over
which an analog message can be transmitted is limited by the initial transmission power. For
digital communications, a long transmission path may also lead to overwhelming noise and
interference. The trick, however, is to set up repeater stations along the transmission path at
distances short enough to be able to detect signal pulses before the noise and distortion have
a chance to accumulate sufficiently. At each repeater station the pulses are detected, and new,
clean pulses are transmitted to the next repeater station, which, in turn, duplicates the same
process. If the noise and distortion are within limits (which is possible because of the closely spaced
repeaters), pulses can be detected correctly.* This way the digital messages can be transmitted
over longer distances with greater reliability. The most significant error in PCM comes from
quantizing. This error can be reduced as much as desired by increasing the number of
quantizing levels, the price of which is paid in an increased bandwidth of the transmission medium
(channel).

3. Digital hardware implementation is flexible and permits the use of microprocessors, 
digital switching, and large-scale integrated circuits. 

4. Digital signals can be coded to yield extremely low error rates and high fidelity, as well
as privacy.

5. It is easier and more efficient to multiplex several digital signals.

6. Digital communication is inherently more efficient than analog in exchanging SNR for 
bandwidth. 

7. Digital signal storage is relatively easy and inexpensive. It also makes it possible to search
and select information from distant electronic databases.

8. Reproduction with digital messages can be extremely reliable without deterioration. 
Analog messages such as photocopies and films, for example, lose quality at each successive 
stage of reproduction and must be transported physically from one distant place to another, 
often at relatively high cost. 


* The error in pulse detection can be made negligible.
9. The cost of digital hardware continues to halve every two or three years, while
performance or capacity doubles over the same time period. And there is no end in sight
yet to this breathtaking and relentless exponential progress in digital technology. As a
result, digital technologies today dominate in any given area of communication or storage
technologies.
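
The argument for regenerative repeaters (advantage 2 above) can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the text: each hop adds independent Gaussian noise of standard deviation 0.25 to pulses of amplitude ±1, the analog link uses ideal unit-gain amplifiers that pass signal and noise alike, and all names and parameter values are hypothetical.

```python
import random

def analog_chain(bits, hops, sigma, rng):
    """Amplify-and-forward: noise from every hop accumulates before one final decision."""
    received = []
    for b in bits:
        x = 1.0 if b else -1.0
        for _ in range(hops):
            x += rng.gauss(0.0, sigma)      # amplifier cannot separate signal from noise
        received.append(1 if x > 0 else 0)
    return received

def regenerative_chain(bits, hops, sigma, rng):
    """Each repeater detects the pulse and transmits a new, clean pulse."""
    received = []
    for b in bits:
        for _ in range(hops):
            x = (1.0 if b else -1.0) + rng.gauss(0.0, sigma)
            b = 1 if x > 0 else 0           # threshold detection, then regeneration
        received.append(b)
    return received

def count_errors(sent, received):
    """Number of positions where the received bit differs from the sent bit."""
    return sum(s != r for s, r in zip(sent, received))

rng = random.Random(1)                      # fixed seed so the run is repeatable
bits = [rng.randint(0, 1) for _ in range(5000)]
analog_errors = count_errors(bits, analog_chain(bits, 16, 0.25, rng))
digital_errors = count_errors(bits, regenerative_chain(bits, 16, 0.25, rng))
print(analog_errors, digital_errors)
```

Under these assumptions the accumulated noise after 16 hops is as large as the pulse itself, whereas each regenerating repeater faces only the noise of a single hop.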


A Historical Note 

The ancient Indian writer Pingala applied what turns out to be advanced mathematical concepts
for describing prosody, and in doing so presented the first known description of a binary numeral
system, possibly as early as the eighth century BCE.⁶ Others, like R. Hall in Mathematics of
Poetry, place him later, circa 200 BCE. Gottfried Wilhelm Leibniz (1646-1716) was the first
mathematician in the West to work out systematically the binary representation (using 1s and 0s)
for any number. He felt a spiritual significance in this discovery, believing that 1, representing
unity, was clearly a symbol for God, while 0 represented nothingness. He reasoned that if all
numbers can be represented merely by the use of 1 and 0, this surely proves that God created
the universe out of nothing!


6.2.2 Quantizing 

As mentioned earlier, digital signals come from a variety of sources. Some sources such as 
computers are inherently digital. Some sources are analog, but are converted into digital form 
by a variety of techniques such as PCM and delta modulation (DM), which will now be 
analyzed. The rest of this section provides quantitative discussion of PCM and its various 
aspects, such as quantizing, encoding, synchronizing, the required transmission bandwidth 
and SNR. 

For quantization, we limit the amplitude of the message signal m(t) to the range (-m_p, m_p),
as shown in Fig. 6.14. Note that m_p is not necessarily the peak amplitude of m(t). The amplitudes
of m(t) beyond ±m_p are simply chopped off. Thus, m_p is not a parameter of the signal m(t);
rather, it is the limit of the quantizer. The amplitude range (-m_p, m_p) is divided into L uniformly
spaced intervals, each of width Δv = 2m_p/L. A sample value is approximated by the midpoint
of the interval in which it lies (Fig. 6.14). The quantized samples are coded and transmitted
as binary pulses. At the receiver some pulses may be detected incorrectly. Hence, there are
two sources of error in this scheme: quantization error and pulse detection error. In almost all
practical schemes, the pulse detection error is quite small compared to the quantization error
and can be ignored. In the present analysis, therefore, we shall assume that the error in the
received signal is caused exclusively by quantization.

If m(kT_s) is the kth sample of the signal m(t), and if $\hat{m}(kT_s)$ is the corresponding quantized
sample, then from the interpolation formula in Eq. (6.10),

$$ m(t) = \sum_k m(kT_s)\,\mathrm{sinc}\,(2\pi Bt - k\pi) $$

and

$$ \hat{m}(t) = \sum_k \hat{m}(kT_s)\,\mathrm{sinc}\,(2\pi Bt - k\pi) $$

where $\hat{m}(t)$ is the signal reconstructed from quantized samples. The distortion component q(t)
in the reconstructed signal is q(t) = $\hat{m}(t)$ - m(t). Thus,

$$
\begin{aligned}
q(t) &= \sum_k \left[\hat{m}(kT_s) - m(kT_s)\right] \mathrm{sinc}\,(2\pi Bt - k\pi) \\
     &= \sum_k q(kT_s)\,\mathrm{sinc}\,(2\pi Bt - k\pi)
\end{aligned}
$$

where q(kT_s) is the quantization error in the kth sample. The signal q(t) is the undesired signal,
and, hence, acts as noise, known as quantization noise. To calculate the power, or the mean
square value of q(t), we have

$$
\begin{aligned}
\overline{q^2(t)} &= \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} q^2(t)\,dt \\
 &= \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} \left[\sum_k q(kT_s)\,\mathrm{sinc}\,(2\pi Bt - k\pi)\right]^2 dt \qquad (6.29a)
\end{aligned}
$$


We can show that (see Prob. 3.7-4) the signals sinc (2πBt - mπ) and sinc (2πBt - nπ) are
orthogonal, that is,

$$ \int_{-\infty}^{\infty} \mathrm{sinc}\,(2\pi Bt - m\pi)\,\mathrm{sinc}\,(2\pi Bt - n\pi)\,dt = \begin{cases} 0 & m \ne n \\ \dfrac{1}{2B} & m = n \end{cases} \qquad (6.29b) $$


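Because of this result, the integrals of the cross-product terms on the right-hand side of
Eq. (6.29a) vanish, and we obtain

$$
\begin{aligned}
\overline{q^2(t)} &= \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} \sum_k q^2(kT_s)\,\mathrm{sinc}^2(2\pi Bt - k\pi)\,dt \\
 &= \lim_{T\to\infty} \frac{1}{T}\sum_k q^2(kT_s)\int_{-T/2}^{T/2} \mathrm{sinc}^2(2\pi Bt - k\pi)\,dt
\end{aligned}
$$

From the orthogonality relationship (6.29b), it follows that

$$ \overline{q^2(t)} = \lim_{T\to\infty} \frac{1}{2BT}\sum_k q^2(kT_s) \qquad (6.30) $$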


Because the sampling rate is 2B, the total number of samples over the averaging interval T is
2BT. Hence, the right-hand side of Eq. (6.30) represents the average, or the mean of the square
of the quantization error. The quantum levels are separated by Δv = 2m_p/L. Since a sample
value is approximated by the midpoint of the subinterval (of height Δv) in which the sample
falls, the maximum quantization error is ±Δv/2. Thus, the quantization error lies in the range
(-Δv/2, Δv/2), where

$$ \Delta v = \frac{2m_p}{L} \qquad (6.31) $$

Assuming that the error is equally likely to lie anywhere in the range (-Δv/2, Δv/2), the
mean square quantizing error $\overline{q^2}$ is given by*

$$
\begin{aligned}
\overline{q^2} &= \frac{1}{\Delta v}\int_{-\Delta v/2}^{\Delta v/2} q^2\,dq \\
 &= \frac{(\Delta v)^2}{12} \qquad (6.32) \\
 &= \frac{m_p^2}{3L^2} \qquad (6.33)
\end{aligned}
$$

Because $\overline{q^2(t)}$ is the mean square value or power of the quantization noise, we shall denote it
by N_q,

$$ N_q = \overline{q^2(t)} = \frac{m_p^2}{3L^2} $$

Assuming that the pulse detection error at the receiver is negligible, the reconstructed signal
$\hat{m}(t)$ at the receiver output is

$$ \hat{m}(t) = m(t) + q(t) $$

The desired signal at the output is m(t), and the (quantization) noise is q(t). Since the power
of the message signal m(t) is $\overline{m^2(t)}$, then

$$ S_o = \overline{m^2(t)}, \qquad N_o = N_q = \frac{m_p^2}{3L^2} $$

and

$$ \frac{S_o}{N_o} = 3L^2\,\frac{\overline{m^2(t)}}{m_p^2} \qquad (6.34) $$

In this equation, m_p is the peak amplitude value that a quantizer can accept, and is therefore
a parameter of the quantizer. This means S_o/N_o, the SNR, is a linear function of the message
signal power $\overline{m^2(t)}$ (see Fig. 6.18 with μ = 0).
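
Both results can be checked numerically. The sketch below is illustrative: the test input is a ramp that sweeps the quantizer range uniformly, which is an assumption chosen so that the quantization error is equally likely to lie anywhere in (-Δv/2, Δv/2), and the function and variable names are hypothetical. It compares the measured noise power with m_p²/(3L²) and the measured SNR with Eq. (6.34).

```python
def quantize_midpoint(x, m_p, L):
    """Uniform midpoint quantizer on (-m_p, m_p) with L levels."""
    step = 2 * m_p / L
    x = min(max(x, -m_p), m_p)                     # amplitudes beyond the range are chopped off
    index = min(int((x + m_p) / step), L - 1)
    return -m_p + (index + 0.5) * step

m_p, L = 1.0, 256
N = 1000 * L                                       # 1000 test points in every subinterval
samples = [-m_p + (i + 0.5) * 2 * m_p / N for i in range(N)]   # uniform sweep of the range

noise_power = sum((quantize_midpoint(x, m_p, L) - x) ** 2 for x in samples) / N
signal_power = sum(x * x for x in samples) / N

predicted_noise = m_p ** 2 / (3 * L ** 2)                  # N_q from Eq. (6.33)
predicted_snr = 3 * L ** 2 * signal_power / m_p ** 2       # S_o/N_o from Eq. (6.34)

print(noise_power, predicted_noise)
print(signal_power / noise_power, predicted_snr)
```

For this particular input the signal power is m_p²/3, so Eq. (6.34) reduces to an SNR of about L², roughly 48 dB for L = 256.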


* Those who are familiar with the theory of probability can derive this result directly by noting that the probability
density of the quantization error q is 1/(2m_p/L) = L/2m_p over the range |q| ≤ m_p/L and is zero elsewhere. Hence,

$$ \overline{q^2} = \int_{-m_p/L}^{m_p/L} q^2\,p(q)\,dq = \int_{-m_p/L}^{m_p/L} \frac{L}{2m_p}\,q^2\,dq = \frac{m_p^2}{3L^2} $$
6.2.3 Principle of Progressive Taxation: Nonuniform
Quantization


Recall that S_o/N_o, the SNR, is an indication of the quality of the received signal. Ideally we
would like to have a constant SNR (the same quality) for all values of the message signal power
$\overline{m^2(t)}$. Unfortunately, the SNR is directly proportional to the signal power $\overline{m^2(t)}$, which varies
from speaker to speaker by as much as 40 dB (a power ratio of 10^4). The signal power can also
vary because of the different lengths of the connecting circuits. This indicates that the SNR in
Eq. (6.34) can vary widely, depending on the speaker and the length of the circuit. Even for
the same speaker, the quality of the received signal will deteriorate markedly when the person
speaks softly. Statistically, it is found that smaller amplitudes predominate in speech and larger
amplitudes are much less frequent. This means the SNR will be low most of the time.

The root of this difficulty lies in the fact that the quantizing steps are of uniform value
Δv = 2m_p/L. The quantization noise N_q = (Δv)²/12 [Eq. (6.32)] is directly proportional
to the square of the step size. The problem can be solved by using smaller steps for smaller
amplitudes (nonuniform quantizing), as shown in Fig. 6.15a. The same result is obtained by
first compressing signal samples and then using a uniform quantization. The input-output
characteristics of a compressor are shown in Fig. 6.15b. The horizontal axis is the normalized
input signal (i.e., the input signal amplitude m divided by the signal peak value m_p). The
vertical axis is the output signal y. The compressor maps input signal increments Δm into
larger increments Δy for small input signals, and vice versa for large input signals. Hence, a
given interval Δm contains a larger number of steps (or smaller step size) when m is small.
The quantization noise is lower for smaller input signal power. An approximately logarithmic
compression characteristic yields a quantization noise nearly proportional to the signal power
$\overline{m^2(t)}$, thus making the SNR practically independent of the input signal power over a large
dynamic range (see later Fig. 6.18). This approach of equalizing the SNR appears similar to
the use of progressive income tax to equalize incomes. The loud talkers and stronger signals
are penalized with higher noise steps Δv to compensate the soft talkers and weaker signals.

Among several choices, two compression laws have been accepted as desirable standards
by the ITU-T [6]: the μ-law used in North America and Japan, and the A-law used in Europe and
the rest of the world and on international routes. Both the μ-law and the A-law curves have
odd symmetry about the origin. The μ-law (for positive amplitudes) is given by

$$y = \frac{1}{\ln(1+\mu)} \ln\left(1 + \frac{\mu m}{m_p}\right), \qquad 0 \le \frac{m}{m_p} \le 1 \tag{6.35a}$$

The A-law (for positive amplitudes) is

$$y = \begin{cases}
\dfrac{A}{1+\ln A}\left(\dfrac{m}{m_p}\right), & 0 \le \dfrac{m}{m_p} \le \dfrac{1}{A} \\[2ex]
\dfrac{1}{1+\ln A}\left(1 + \ln\dfrac{Am}{m_p}\right), & \dfrac{1}{A} \le \dfrac{m}{m_p} \le 1
\end{cases} \tag{6.35b}$$



These characteristics are shown in Fig. 6.16.

The compression parameter μ (or A) determines the degree of compression. To obtain a
nearly constant $S_o/N_o$ over a dynamic range of 40 dB for input signal power, μ should be
greater than 100. Early North American channel banks and other digital terminals used a value
of μ = 100, which yielded the best results for 7-bit (128-level) encoding. An optimum value
of μ = 255 has been used for all North American 8-bit (256-level) digital terminals, and the
earlier value of μ is now almost extinct. For the A-law, a value of A = 87.6 gives comparable
results and has been standardized by the ITU-T [6].

Figure 6.15 Nonuniform quantization.

The compressed samples must be restored to their original values at the receiver by using 
an expander with a characteristic complementary to that of the compressor. The compressor and 
the expander together are called the compandor. Figure 6.17 describes the use of compressor 
and expander along with a uniform quantizer to achieve nonuniform quantization. 
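
The compressor, uniform quantizer, and expander chain can be tried numerically. The following is a minimal sketch, assuming samples already normalized to x = m/m_p in [-1, 1] and a midpoint uniform quantizer; the function names are illustrative and are not taken from any standard or library.

```python
import math

MU = 255.0  # compression parameter quoted in the text for 8-bit terminals


def mu_compress(x, mu=MU):
    """Eq. (6.35a), extended to negative inputs by odd symmetry; x = m/m_p."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress: recover x = m/m_p from y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def uniform_quantize(v, levels):
    """Midpoint uniform quantizer on [-1, 1] with `levels` equal steps (assumed form)."""
    step = 2.0 / levels
    index = min(int((v + 1.0) / step), levels - 1)
    return -1.0 + (index + 0.5) * step


def companded_quantize(x, levels, mu=MU):
    """Compress, quantize uniformly, then expand."""
    return mu_expand(uniform_quantize(mu_compress(x, mu), levels), mu)


if __name__ == "__main__":
    L = 256
    for x in (0.005, 0.05, 0.5):
        e_uniform = abs(uniform_quantize(x, L) - x)
        e_companded = abs(companded_quantize(x, L) - x)
        print(f"x={x}: uniform error {e_uniform:.2e}, companded error {e_companded:.2e}")
```

For small x the companded error is bounded by a much smaller step than the uniform step 2/L, at the cost of larger steps near the peak, which is the trade described above.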

Generally speaking, time compression of a signal increases its bandwidth. But in PCM,
we are compressing not the signal m(t) in time but its sample values. Because neither the time
scale nor the number of samples changes, the problem of bandwidth increase does not arise
here. It happens that when a μ-law compandor is used, the output SNR is

$$\frac{S_o}{N_o} \simeq \frac{3L^2}{[\ln(1+\mu)]^2}, \qquad \mu^2 \gg \frac{m_p^2}{\overline{m^2(t)}} \tag{6.36}$$

Figure 6.16 (a) μ-law characteristic. (b) A-law characteristic.

Figure 6.17 Utilization of compressor and expander for nonuniform quantization.


The output SNR for the cases of μ = 255 and μ = 0 (uniform quantization) as a function of
$\overline{m^2(t)}$ (the message signal power) is shown in Fig. 6.18.

The Compandor 

A logarithmic compressor can be realized by a semiconductor diode, because the V-I 
characteristic of such a diode is of the desired form in the first quadrant:

$$V = \frac{KT}{q} \ln\left(1 + \frac{I}{I_s}\right)$$

Two matched diodes in parallel with opposite polarity provide the approximate characteristic
in the first and third quadrants (ignoring the saturation current). In practice, adjustable resistors
are placed in series with each diode and a third variable resistor is added in parallel. By adjusting
various resistors, the resulting characteristic is made to fit a finite number of points (usually
seven) on the ideal characteristics.

An alternative approach is to use a piecewise linear approximation to the logarithmic
characteristics. A 15-segmented approximation (Fig. 6.19) to the eight-bit (L = 256) μ = 255
law is widely used in the D2 channel bank that is used in conjunction with the T1 carrier system.
The segmented approximation is only marginally inferior in terms of SNR [8]. The piecewise
linear approximation has almost universally replaced earlier logarithmic approximations to
the true μ = 255 characteristic and is the method of choice in North American standards.
Figure 6.18 Ratio of signal to quantization noise in PCM with and without compression.

Figure 6.19 Piecewise linear compressor characteristic.


Though a true μ = 255 compressor working with a μ = 255 expander will be superior to similar
piecewise linear devices, a digital terminal device exhibiting the true characteristic in
today's network must work end-to-end against other network elements that use the piecewise
linear approximation. Such a combination of differing characteristics is inferior to either of
the characteristics obtained when the compressor and the expander operate using the same
compression law.

In the standard audio file format used by Sun, Unix, and Java, the audio in “au”
files can be pulse-code-modulated or compressed with the ITU-T G.711 standard through
either the μ-law or the A-law [6]. The μ-law compressor (μ = 255) converts 14-bit
signed linear PCM samples to logarithmic 8-bit samples, leading to storage saving. The
A-law compressor (A = 87.6) converts 13-bit signed linear PCM samples to logarithmic
8-bit samples. In both cases, sampling at the rate of 8000 Hz, a G.711 encoder thus creates from
audio signals bit streams at 64 kilobits per second (kbit/s). Since the A-law and the μ-law are
mutually compatible, audio recorded into “au” files can be decoded in either format. It should
be noted that the Microsoft WAV audio format also has compression options that use μ-law
and A-law.
The PCM Encoder

The multiplexed PAM output is applied at the input of the encoder, which quantizes and encodes
each sample into a group of n binary digits. A variety of encoders is available [7, 10]. We shall
discuss here the digit-at-a-time encoder, which makes n sequential comparisons to generate an
n-bit codeword. The sample is compared with a voltage obtained by a combination of reference
voltages proportional to $2^7, 2^6, 2^5, \ldots, 2^0$. The reference voltages are conveniently
generated by a bank of resistors $R, 2R, 2^2R, \ldots, 2^7R$.

The encoding involves answering successive questions, beginning with whether the sample
is in the upper or lower half of the allowed range. The first code digit 1 or 0 is generated,
depending on whether the sample is in the upper or the lower half of the range. In the second
step, another digit 1 or 0 is generated, depending on whether the sample is in the upper or the
lower half of the subinterval in which it has been located. This process continues until the last
binary digit in the code has been generated.

Decoding is the inverse of encoding. In this case, each of the n digits is applied to a resistor
of different value. The kth digit is applied to a resistor $2^k R$. The currents in all the resistors
are added. The sum is proportional to the quantized sample value. For example, a binary code
word 10010110 will give a current proportional to $2^7 + 0 + 0 + 2^4 + 0 + 2^2 + 2^1 + 0 = 150$.
This completes the D/A conversion.
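
The successive halving performed by the digit-at-a-time encoder, and the weighted sum performed by the decoder, can be sketched in software. This is only an illustration under assumptions: the sample is taken to be a nonnegative value in [0, full_scale), and the names `encode_sample` and `decode_bits` are hypothetical.

```python
def encode_sample(sample, n_bits=8, full_scale=256.0):
    """Digit-at-a-time (successive-approximation) encoder.

    At each step a trial reference adds the next weight 2^(n-1), ..., 2^0
    (scaled to the range); the digit is 1 if the sample lies in the upper
    half of the current subinterval, else 0.
    """
    unit = full_scale / (1 << n_bits)    # value represented by weight 2^0
    reference = 0.0
    bits = []
    for k in range(n_bits - 1, -1, -1):
        trial = reference + (1 << k) * unit
        if sample >= trial:              # upper half: keep this weight
            bits.append(1)
            reference = trial
        else:                            # lower half: drop this weight
            bits.append(0)
    return bits


def decode_bits(bits):
    """Weighted sum of the digits, most significant digit first (the D/A step)."""
    n = len(bits)
    return sum(b << (n - 1 - i) for i, b in enumerate(bits))


if __name__ == "__main__":
    code = encode_sample(150.4)
    print(code, decode_bits(code))
```

Each loop iteration corresponds to one of the n sequential comparisons; the decoder reproduces the sum used in the 10010110 example.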

6.2.4 Transmission Bandwidth and the Output SNR 

For a binary PCM, we assign a distinct group of n binary digits (bits) to each of the L quantization
levels. Because a sequence of n binary digits can be arranged in $2^n$ distinct patterns,

$$L = 2^n \quad \text{or} \quad n = \log_2 L \tag{6.37}$$

each quantized sample is, thus, encoded into n bits. Because a signal m(t) band-limited to B
Hz requires a minimum of 2B samples per second, we require a total of 2nB bit/s, that is, 2nB
pieces of information per second. Because a unit bandwidth (1 Hz) can transmit a maximum of
two pieces of information per second (Sec. 6.1.3), we require a minimum channel of bandwidth
$B_T$ Hz, given by

$$B_T = nB \ \text{Hz} \tag{6.38}$$

This is the theoretical minimum transmission bandwidth required to transmit the PCM signal. 
In Secs. 7.2 and 7.3, we shall see that for practical reasons we may use a transmission bandwidth 
higher than this minimum. 


Example 6.2 A signal m(t) band-limited to 3 kHz is sampled at a rate 33⅓% higher than the Nyquist rate.
The maximum acceptable error in the sample amplitude (the maximum quantization error) is
0.5% of the peak amplitude $m_p$. The quantized samples are binary coded. Find the minimum
bandwidth of a channel required to transmit the encoded binary signal. If 24 such signals
are time-division-multiplexed, determine the minimum transmission bandwidth required to
transmit the multiplexed signal.

The Nyquist sampling rate is $R_N = 2 \times 3000 = 6000$ Hz (samples per second). The
actual sampling rate is $R_A = 6000 \times 1\tfrac{1}{3} = 8000$ Hz.

The quantization step is $\Delta v$, and the maximum quantization error is $\pm\Delta v/2$.
Therefore, from Eq. (6.31),

$$\frac{\Delta v}{2} = \frac{m_p}{L} = \frac{0.5}{100}\, m_p \quad \Longrightarrow \quad L = 200$$



For binary coding, L must be a power of 2. Hence, the next higher value of L that is a
power of 2 is L = 256.

From Eq. (6.37), we need $n = \log_2 256 = 8$ bits per sample. We need to transmit
a total of C = 8 × 8000 = 64,000 bit/s. Because we can transmit up to 2 bit/s per hertz
of bandwidth, we require a minimum transmission bandwidth $B_T = C/2 = 32$ kHz.

The multiplexed signal has a total of $C_M = 24 \times 64{,}000 = 1.536$ Mbit/s, which
requires a minimum of 1.536/2 = 0.768 MHz of transmission bandwidth.
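
The arithmetic of this example can be reproduced with a short script. It is a sketch only: the function name and the dictionary keys are illustrative, and exact fractions are used so that the 33⅓% oversampling does not pick up rounding error.

```python
import math
from fractions import Fraction


def pcm_requirements(bandwidth_hz, oversample, max_error, channels=1):
    """Sampling rate, L, n, bit rate, and minimum bandwidth for binary PCM.

    oversample: fraction above the Nyquist rate (1/3 for 33 1/3 %).
    max_error:  maximum quantization error as a fraction of m_p.
    """
    sampling_rate = 2 * bandwidth_hz * (1 + oversample)
    # Maximum error is delta_v / 2 = m_p / L, so L >= 1 / max_error.
    min_levels = math.ceil(1 / max_error)
    bits = max(1, (min_levels - 1).bit_length())   # smallest n with 2^n >= L
    bit_rate = bits * sampling_rate * channels
    return {
        "sampling_rate_hz": sampling_rate,
        "levels": 2 ** bits,
        "bits_per_sample": bits,
        "bit_rate": bit_rate,
        "min_bandwidth_hz": bit_rate / 2,          # 2 bit/s per hertz
    }


if __name__ == "__main__":
    single = pcm_requirements(3000, Fraction(1, 3), Fraction(1, 200))
    multiplexed = pcm_requirements(3000, Fraction(1, 3), Fraction(1, 200), channels=24)
    print(single)
    print(multiplexed)
```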


Exponential Increase of the Output SNR 

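From Eq. (6.37), $L^2 = 2^{2n}$, and the output SNR in Eq. (6.34) or Eq. (6.36) can be expressed as

$$\frac{S_o}{N_o} = c\,(2)^{2n} \tag{6.39}$$

where

$$c = \begin{cases}
\dfrac{3\,\overline{m^2(t)}}{m_p^2} & \text{[uncompressed case, in Eq. (6.34)]} \\[2ex]
\dfrac{3}{[\ln(1+\mu)]^2} & \text{[compressed case, in Eq. (6.36)]}
\end{cases}$$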

Substitution of Eq. (6.38) into Eq. (6.39) yields

$$\frac{S_o}{N_o} = c\,(2)^{2B_T/B} \tag{6.40}$$

From Eq. (6.40) we observe that the SNR increases exponentially with the transmission bandwidth
$B_T$. This trade of SNR for bandwidth is attractive and comes close to the upper theoretical
limit. A small increase in bandwidth yields a large benefit in terms of SNR. This relationship
is clearly seen by using the decibel scale to rewrite Eq. (6.39) as

$$\begin{aligned}
\left(\frac{S_o}{N_o}\right)_{\text{dB}} &= 10\log_{10}\left[c\,(2)^{2n}\right] \\
&= 10\log_{10} c + 20n\log_{10} 2 \\
&= (\alpha + 6n)\ \text{dB}
\end{aligned} \tag{6.41}$$


where $\alpha = 10\log_{10} c$. This shows that increasing n by 1 (increasing one bit in the codeword)
quadruples the output SNR (a 6 dB increase). Thus, if we increase n from 8 to 9, the SNR
quadruples, but the transmission bandwidth increases only from 32 kHz to 36 kHz (an increase
of only 12.5%). This shows that in PCM, SNR can be controlled by transmission bandwidth.
We shall see later that frequency and phase modulation also do this, but they require a doubling
of the bandwidth to quadruple the SNR. In this respect, PCM is strikingly superior to FM or PM.
Example 6.3 A signal m(t) of bandwidth B = 4 kHz is transmitted using a binary companded PCM with
μ = 100. Compare the case of L = 64 with the case of L = 256 from the point of view of
transmission bandwidth and the output SNR.


For L = 64, n = 6, and the transmission bandwidth is nB = 24 kHz,

$$\frac{S_o}{N_o} = (\alpha + 36)\ \text{dB}, \qquad \alpha = 10\log_{10}\frac{3}{[\ln(101)]^2} = -8.51$$

Hence,

$$\frac{S_o}{N_o} = 27.49\ \text{dB}$$

For L = 256, n = 8, and the transmission bandwidth is 32 kHz,

$$\frac{S_o}{N_o} = \alpha + 6n = 39.49\ \text{dB}$$

The difference between the two SNRs is 12 dB, which is a ratio of 16. Thus, the SNR
for L = 256 is 16 times the SNR for L = 64. The former requires just about 33% more
bandwidth compared to the latter.
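
The numbers in this example follow directly from Eqs. (6.38) and (6.41) and can be checked with a few lines of code. The sketch below uses hypothetical function names; `snr_db_rule` uses the rounded slope of 6 dB per bit as in Eq. (6.41), while `snr_db_exact` keeps $20\log_{10} 2 \approx 6.02$ dB per bit, so the two differ slightly.

```python
import math


def alpha_db(mu):
    """alpha = 10 log10(c) for the compressed case, c = 3 / [ln(1 + mu)]^2."""
    return 10 * math.log10(3 / math.log1p(mu) ** 2)


def snr_db_rule(n_bits, mu):
    """Eq. (6.41) with the rounded slope of 6 dB per bit."""
    return alpha_db(mu) + 6 * n_bits


def snr_db_exact(n_bits, mu):
    """Eq. (6.39) in decibels without rounding 20 log10(2) to 6."""
    return alpha_db(mu) + 20 * n_bits * math.log10(2)


def transmission_bandwidth_hz(n_bits, signal_bandwidth_hz):
    """Eq. (6.38): B_T = n B."""
    return n_bits * signal_bandwidth_hz


if __name__ == "__main__":
    for levels in (64, 256):
        n = int(math.log2(levels))
        print(levels, transmission_bandwidth_hz(n, 4000),
              round(snr_db_rule(n, 100), 2), round(snr_db_exact(n, 100), 2))
```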


Comments on Logarithmic Units 

Logarithmic units and logarithmic scales are very convenient when a variable has a large
dynamic range. Such is the case with frequency variables or SNRs. A logarithmic unit for the
power ratio is the decibel (dB), defined as $10\log_{10}$ (power ratio). Thus, an SNR is x dB, where

$$x = 10\log_{10}\frac{S}{N}$$

We use the same unit to express power gain or loss over a certain transmission medium. For
instance, if over a certain cable the signal power is attenuated by a factor of 15, the cable gain is

$$G = 10\log_{10}\frac{1}{15} = -11.76\ \text{dB}$$

or the cable attenuation (loss) is 11.76 dB.

Although the decibel is a measure of power ratios, it is often used as a measure of power
itself. For instance, “100 watt” may be considered to be a power ratio of 100 with respect to
1-watt power, and is expressed in units of dBW as

$$P_{\text{dBW}} = 10\log_{10} 100 = 20\ \text{dBW}$$

Thus, 100-watt power is 20 dBW. Similarly, power measured with respect to 1 mW power is
dBm. For instance, 100-watt power is

$$P_{\text{dBm}} = 10\log_{10}\frac{100\ \text{W}}{1\ \text{mW}} = 50\ \text{dBm}$$
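
These conversions are easy to mechanize. The helper names below are illustrative only; the sketch simply applies the definition 10 log10 (power ratio) with a 1 W or 1 mW reference.

```python
import math


def to_db(power_ratio):
    """Power ratio expressed in decibels."""
    return 10 * math.log10(power_ratio)


def from_db(db):
    """Decibel value converted back to a power ratio."""
    return 10 ** (db / 10)


def watts_to_dbw(power_w):
    """Power relative to 1 W, in dBW."""
    return to_db(power_w / 1.0)


def watts_to_dbm(power_w):
    """Power relative to 1 mW, in dBm."""
    return to_db(power_w / 1e-3)


if __name__ == "__main__":
    print(round(to_db(1 / 15), 2))   # cable gain for a factor-of-15 attenuation
    print(watts_to_dbw(100), watts_to_dbm(100))
    print(round(from_db(12), 2))     # the "ratio of 16" in Example 6.3 is a rounding of this
```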


6.3 DIGITAL TELEPHONY: PCM IN T1 
CARRIER SYSTEMS 

A Historical Note 

Because of the unavailability of suitable switching devices, more than 20 years elapsed between 
the invention of PCM and its implementation. Vacuum tubes, used before the invention of the 
transistor, were not only bulky, but they were poor switches and dissipated a lot of heat. Systems 
having vacuum tubes as switches were large, rather unreliable, and tended to overheat. PCM 
was just waiting for the invention of the transistor, which happens to be a small device that 
consumes little power and is a nearly ideal switch. 

Coincidentally, at about the time the transistor was invented, the demand for telephone 
service had become so heavy that the existing system was overloaded, particularly in large 
cities. It was not easy to install new underground cables because space available under the 
streets in many cities was already occupied by other services (water, gas, sewer, etc.). Moreover, 
digging up streets and causing many dislocations was not very attractive. An attempt was made 
on a limited scale to increase the capacity by frequency-division-multiplexing several voice 
channels through amplitude modulation. Unfortunately, the cables were primarily designed 
for the audio voice range (0-4 kHz) and suffered severely from noise. Furthermore, cross talk 
between pairs of channels on the same cable was unacceptable at high frequencies. Ironically, 
PCM—requiring a bandwidth several times larger than that required for FDM signals—offered 
the solution. This is because digital systems with closely spaced regenerative repeaters can 
work satisfactorily on noisy lines that give poor high-frequency performance [9]. The repeaters,
spaced approximately 6000 feet apart, clean up the signal and regenerate new pulses before the
pulses get too distorted and noisy. This is the history of the Bell System's T1 carrier system [10].
A pair of wires that used to transmit one audio signal of bandwidth 4 kHz is now used to transmit 
24 time-division-multiplexed PCM telephone signals with a total bandwidth of 1.544 MHz. 


T1 Time Division Multiplexing 

A schematic of a T1 carrier system is shown in Fig. 6.20a. All 24 channels are sampled 
in a sequence. The sampler output represents a time-division-multiplexed PAM signal. The 
multiplexed PAM signal is now applied to the input of an encoder that quantizes each sample 
and encodes it into eight binary pulses, a binary codeword* (see Fig. 6.20b). The signal,
now converted to digital form, is sent over the transmission medium. Regenerative repeaters 
spaced approximately 6000 feet apart detect the pulses and retransmit new pulses. At the 
receiver, the decoder converts the binary pulses into samples (decoding). The samples are 
then demultiplexed (i.e., distributed to each of the 24 channels). The desired audio signal is 
reconstructed by passing the samples through a low-pass filter in each channel. 


* In an earlier version, each sample was encoded by seven bits. An additional bit was added for signaling.

Figure 6.20 T1 carrier system.



The commutators in Fig. 6.20 are not mechanical but are high-speed electronic switching
circuits. Several schemes are available for this purpose [11]. Sampling is done by electronic gates
(such as a bridge diode circuit, as shown in Fig. 4.5a) opened periodically by narrow pulses of
2 μs duration. The 1.544 Mbit/s signal of the T1 system, called digital signal level 1 (DS1),
is used further to multiplex into progressively higher level signals DS2, DS3, and DS4, as
described next, in Sec. 6.4.

After the Bell System introduced the T1 carrier system in the United States, dozens of
variations were proposed or adopted elsewhere before the ITU-T standardized its 30-channel
PCM system with a rate of 2.048 Mbit/s (in contrast to T1, with 24 channels and 1.544 Mbit/s).
The 30-channel system is used all over the world, except in North America and Japan. Because
of the widespread adoption of the T1 carrier system in the United States and Japan before the
ITU-T standardization, the two standards continue to be used in different parts of the world,
with appropriate interfaces in international connections.
Figure 6.21 T1 system signaling format.



Synchronizing and Signaling 

Binary codewords corresponding to samples of each of the 24 channels are multiplexed in
a sequence, as shown in Fig. 6.21. A segment containing one codeword (corresponding to
one sample) from each of the 24 channels is called a frame. Each frame has 24 × 8 = 192
information bits. Because the sampling rate is 8000 samples per second, each frame takes
125 μs. To separate information bits correctly at the receiver, it is necessary to be sure where
each frame begins. Therefore, a framing bit is added at the beginning of each frame. This
makes a total of 193 bits per frame. Framing bits are chosen so that a sequence of framing bits,
one at the beginning of each frame, forms a special pattern that is unlikely to be formed in a
speech signal.

The sequence formed by the first bit from each frame is examined by the logic of the
receiving terminal. If this sequence does not follow the given code pattern (framing bit pattern),
a synchronization loss is detected, and the next position is examined to determine whether it
is actually the framing bit. It takes about 0.4 to 6 ms to detect and about 50 ms (in the worst
possible case) to reframe.

In addition to information and framing bits, we need to transmit signaling bits corresponding
to dialing pulses, as well as telephone on-hook/off-hook signals. When channels
developed by this system are used to transmit signals between telephone switching systems,
the switches must be able to communicate with each other to use the channels effectively.
Since all eight bits are now used for transmission instead of the seven bits used in the earlier
version,* the signaling channel provided by the eighth bit is no longer available. Since only a
rather low-speed signaling channel is required, rather than create extra time slots for this
information, we use one information bit (the least significant bit) of every sixth sample of a signal
to transmit this information. This means that every sixth sample of each voice signal will have
a possible error corresponding to the least significant digit. Every sixth frame, therefore, has
7 × 24 = 168 information bits, 24 signaling bits, and 1 framing bit. In all the remaining frames,
there are 192 information bits and 1 framing bit. This technique is called 7⅚ bit encoding, and
the signaling channel so derived is called robbed-bit signaling. The slight SNR degradation
suffered by impairing one out of six frames is considered to be an acceptable penalty. The
signaling bits for each signal occur at a rate of 8000/6 = 1333 bit/s. The frame format is shown
in Fig. 6.21.

* In the earlier version of T1, quantizing levels L = 128 required only seven information bits. The eighth bit was
used for signaling.
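
The frame arithmetic quoted above (193 bits per frame, 125 μs per frame, the 1.544 Mbit/s line rate, and the 7⅚ average information bits per sample) can be verified with a few constants. The variable names are illustrative; every number comes from the text.

```python
from fractions import Fraction

CHANNELS = 24
BITS_PER_SAMPLE = 8
FRAME_RATE = 8000        # frames per second: one sample per channel per frame
FRAMING_BITS = 1
SIGNALING_PERIOD = 6     # every sixth frame carries robbed (signaling) bits

bits_per_frame = CHANNELS * BITS_PER_SAMPLE + FRAMING_BITS
line_rate = bits_per_frame * FRAME_RATE                  # bit/s
frame_duration_us = Fraction(10**6, FRAME_RATE)          # microseconds

# Per channel: five frames with 8 information bits, one frame with 7.
avg_info_bits = Fraction((SIGNALING_PERIOD - 1) * 8 + 7, SIGNALING_PERIOD)
signaling_rate = Fraction(FRAME_RATE, SIGNALING_PERIOD)  # bit/s per channel

if __name__ == "__main__":
    print(bits_per_frame, line_rate, frame_duration_us)
    print(avg_info_bits, float(signaling_rate))
```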

The older seven-bit framing format required only that frame boundaries be identified
so that the channels could be located in the bit stream. When signaling is superimposed on
the channels in every sixth frame, it is necessary to identify, at the receiver, which frames
are the signaling frames. A new framing structure, called the superframe, was developed
to take care of this. The framing bits are transmitted at 8 kbit/s as before and occupy the
first bit of each frame. The framing bits form a special pattern, which repeats in 12 frames:
100011011100. The pattern thus allows the identification of frame boundaries as before, but
also allows the determination of the locations of the sixth and twelfth frames within the
superframe. Note that the superframe described here is 12 frames in length. Since two bits per
superframe are available for signaling for each channel, it is possible to provide four-state
signaling for a channel by using the four possible patterns of the two signaling bits: 00, 01,
10, and 11. Although most switch-to-switch applications in the telephone network require
only two-state signaling, three- and four-state signaling techniques are used in certain special
applications.
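
To illustrate how the 12-frame pattern locates the signaling frames, the sketch below searches a sequence of received framing bits for the superframe alignment. It is a simplified illustration, not the procedure used in actual terminals: it assumes an error-free sequence of framing bits supplied as a string, and the function names are hypothetical.

```python
SUPERFRAME_PATTERN = "100011011100"   # framing bits of frames 1 to 12, from the text


def find_superframe_offset(framing_bits):
    """Index of the first framing bit that starts a superframe, or None.

    `framing_bits` holds the first bit of each successive frame as '0'/'1'.
    An offset is accepted only if every complete 12-frame block starting
    there matches the pattern.
    """
    n = len(SUPERFRAME_PATTERN)
    for offset in range(n):
        blocks = [framing_bits[i:i + n]
                  for i in range(offset, len(framing_bits) - n + 1, n)]
        if blocks and all(block == SUPERFRAME_PATTERN for block in blocks):
            return offset
    return None


def signaling_frames(offset, total_frames):
    """0-based indices of frames that are the 6th or 12th of a superframe."""
    return [i for i in range(offset, total_frames) if (i - offset) % 12 in (5, 11)]


if __name__ == "__main__":
    received = "1100" + SUPERFRAME_PATTERN * 3   # starts partway through a superframe
    start = find_superframe_offset(received)
    print(start, signaling_frames(start, len(received)))
```

Because no cyclic shift of the pattern equals the pattern itself, only one alignment can match an error-free sequence.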

Advances in digital electronics and in coding theory have made it unnecessary to use
the full 8 kbit/s of the framing channel in a DS1 signal to perform the framing task. A new
superframe structure, called the extended superframe (ESF) format, was introduced during
the 1970s to take advantage of the reduced framing bandwidth requirement. An ESF is 24
frames in length and carries signaling bits in the eighth bit of each channel in frames 6, 12, 18,
and 24. Sixteen-state signaling is thus possible and is sometimes used although, as with the
superframe format, most applications require only two-state signaling.

The 8 kbit/s overhead (framing) capacity of the ESF signal is divided into three channels: 2
kbit/s for framing, 2 kbit/s for a cyclic redundancy check (CRC-6) error detection channel, and
4 kbit/s for a data channel. The highly reliable error checking provided by the CRC-6 pattern
and the use of the data channel to transport information on signal performance as received
by the distant terminal make ESF much more attractive to service providers than the older
superframe format. More discussions on CRC error detection can be found in Chapter 14.

The 2 kbit/s framing channel of the ESF format carries the repetitive pattern 001011..., a
pattern that repeats in 24 frames and is much less vulnerable to counterfeiting than the patterns
associated with the earlier formats.

For various reasons, including the development of intelligent network-switching nodes,
the function of signaling is being transferred out from the channels that carry the messages
or data signals to separate signaling networks called common channel interoffice signaling
(CCIS) systems. The universal deployment of such systems will significantly decrease the
importance of robbed-bit signaling, and all eight bits of each message (or sample) will be
transmitted in most applications.

The Conference on European Postal and Telegraph Administration (CEPT) has standardized
a PCM with 256 bit time slots per frame. Each frame has 30 × 8 = 240 information bits,
corresponding to 30 speech channels (with eight bits each). The remaining 16 bits per frame
are used for frame synchronization and signaling. Therefore, although the bit rate is 2.048
Mbit/s, corresponding to 32 voice channels, only 30 voice channels are transmitted.
6.4 DIGITAL MULTIPLEXING

Several low-bit-rate signals can be multiplexed, or combined, to form one high-bit-rate signal, 
to be transmitted over a high-frequency medium. Because the medium is time-shared by various 
incoming signals, this is a case of TDM (time division multiplexing). The signals from various 
incoming channels, or tributaries, may be as diverse as a digitized voice signal (PCM), a 
computer output, telemetry data, and a digital facsimile. The bit rates of various tributaries 
need not be the same. 

To begin with, consider the case of all tributaries with identical bit rates. Multiplexing can 
be done on a bit-by-bit basis (known as bit or digit interleaving) as shown in Fig. 6.22a, or on a
word-by-word basis (known as byte or word interleaving). Figure 6.22b shows the interleaving 
of words, formed by four bits. The North American digital hierarchy uses bit interleaving 
(except at the lowest level), where bits are taken one at a time from the various signals to be 
multiplexed. Byte interleaving, used in building the DS1 signal and SONET-formatted signals, 
involves inserting bytes in succession from the channels to be multiplexed. 

The T1 carrier, discussed in Sec. 6.3, uses eight-bit word interleaving. When the bit rates
of incoming channels are not identical, the high-bit-rate channel is allocated proportionately
more slots. For example, a four-channel multiplex may consist of three channels B, C, and D of
identical bit rate R and one channel (channel A) with a bit rate of 3R (Fig. 6.22c, d). Similar
results can be attained by combining words of different lengths. It is evident that the minimum
length of the multiplex frame must be a multiple of the lowest common multiple of the incoming
channel bit rates, and, hence, this type of scheme is practical only when some fairly simple
relationship exists among these rates. The case of completely asynchronous channels is discussed
later.
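
Bit interleaving, word interleaving, and the allocation of extra slots to a faster channel can be sketched with strings of '0' and '1'. The function names are illustrative, and the slot plan "ABACAD" used for the 3R channel is an assumed arrangement chosen for the example, since the actual layout of Fig. 6.22 is not reproduced here.

```python
def interleave(tributaries, word_len=1):
    """Multiplex equal-rate tributaries; word_len=1 is bit interleaving,
    word_len=8 is byte (word) interleaving."""
    length = len(tributaries[0])
    if length % word_len or any(len(t) != length for t in tributaries):
        raise ValueError("tributaries must have equal length, a multiple of word_len")
    return "".join(t[i:i + word_len]
                   for i in range(0, length, word_len)
                   for t in tributaries)


def deinterleave(stream, n_tributaries, word_len=1):
    """Distribute the multiplexed stream back to its tributaries."""
    words = [stream[i:i + word_len] for i in range(0, len(stream), word_len)]
    return ["".join(words[k::n_tributaries]) for k in range(n_tributaries)]


def multiplex_by_plan(sources, plan):
    """Fill repeated frames whose slots are assigned by `plan`.

    A channel that appears three times in `plan` gets three slots per
    frame, i.e. three times the bit rate of a channel that appears once.
    """
    iters = {name: iter(bits) for name, bits in sources.items()}
    frames = min(len(sources[name]) // plan.count(name) for name in set(plan))
    return "".join(next(iters[name]) for _ in range(frames) for name in plan)


if __name__ == "__main__":
    print(interleave(["1111", "0000"]))               # bit interleaving
    print(interleave(["1111", "0000"], word_len=2))   # word interleaving
    print(multiplex_by_plan({"A": "111111", "B": "00", "C": "00", "D": "00"}, "ABACAD"))
```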

At the receiving terminal, the incoming digit stream must be divided and distributed to the 
appropriate output channel. For this purpose, the receiving terminal must be able to correctly 
identify each bit. This requires the receiving system to uniquely synchronize in time with the 
beginning of each frame, with each slot in a frame, and with each bit within a slot. This is 
accomplished by adding framing and synchronization bits to the data bits. These bits are part 
of the so-called overhead bits.


6.4.1 Signal Format 

Figure 6.23 illustrates a typical format, that of the DM1/2 multiplexer. We have here bit-by-bit
interleaving of four channels each at a rate of 1.544 Mbit/s. The main frame (multiframe)
consists of four subframes. Each subframe has six overhead bits: for example, subframe
1 (first line in Fig. 6.23) has overhead bits $M_0$, $C_A$, $F_0$, $C_A$, $C_A$, and $F_1$. In
between these overhead bits are 48 interleaved data bits from the four channels (12 data bits from
each channel). We begin with overhead bit $M_0$, followed by 48 multiplexed data bits, then add
a second overhead bit $C_A$ followed by the next 48 multiplexed bits, and so on. Thus, there
are a total of 48 × 6 × 4 = 1152 data bits and 6 × 4 = 24 overhead bits, making a total of
1176 bits/frame. The efficiency is 1152/1176 ≃ 98%. The overhead bits with subscript 0
are always 0 and those with subscript 1 are always 1. Thus, $M_0$ and $F_0$ are all 0s, and $M_1$
and $F_1$ are all 1s. The F digits are periodic 010101... and provide the main framing pattern,
which the demultiplexer uses to synchronize on the frame. After locking onto this pattern, the
demultiplexer searches for the 0111 pattern formed by overhead bits $M_0 M_1 M_1 M_1$. This further
identifies the four subframes, each corresponding to a line in Fig. 6.23. It is possible, although
unlikely, that signal bits will also have a pattern 101010.... The receiver could lock onto this
wrong sequence. The presence of $M_0 M_1 M_1 M_1$ provides verification of the genuine
$F_0 F_1 F_0 F_1$ sequence. The C bits are used to transmit additional information about bit
stuffing, as discussed later.

Figure 6.22

Figure 6.23
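
The bit counts and overhead patterns of the DM1/2 frame described above can be tabulated programmatically. The sketch below only restates the layout given in the text (six overhead bits per subframe, each followed by 48 data bits); the label strings and function names are illustrative.

```python
TRIBUTARIES = "ABCD"
M_BITS = "0111"              # M0 M1 M1 M1, one M bit per subframe
DATA_BITS_PER_GROUP = 48     # 12 bits from each of the four channels


def overhead_sequence(subframe_index):
    """Overhead bit labels of one subframe (0-based), in transmission order."""
    c = "C" + TRIBUTARIES[subframe_index]
    m = "M" + M_BITS[subframe_index]
    return [m, c, "F0", c, c, "F1"]


def frame_layout():
    """(overhead_label, following_data_bits) pairs for the whole multiframe."""
    return [(label, DATA_BITS_PER_GROUP)
            for s in range(len(TRIBUTARIES))
            for label in overhead_sequence(s)]


layout = frame_layout()
overhead_bits = len(layout)
data_bits = sum(count for _, count in layout)
frame_bits = overhead_bits + data_bits
efficiency = data_bits / frame_bits

if __name__ == "__main__":
    print(overhead_bits, data_bits, frame_bits, round(efficiency, 4))
    print("".join(label[1] for label, _ in layout if label[0] == "F"))   # F pattern
    print("".join(label[1] for label, _ in layout if label[0] == "M"))   # M pattern
```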

In the majority of cases, not all incoming channels are active all the time: some transmit 
data, and some are idle. This means the system is underutilized. We can, therefore, accept more 
input channels to take advantage of the inactivity, at any given time, of at least one channel. 
This obviously involves much more complicated switching operations, and also rather careful 
system planning. In any random traffic situation we cannot guarantee that the number of 
transmission channels demanded will not exceed the number available; but by taking account 
of the statistics of the signal sources, it is possible to ensure an acceptably low probability of 
this occurring. Multiplex structures of this type have been developed for satellite systems and 
are known as time division multiple-access (TDMA) systems. 
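
As a rough illustration of how source statistics bound the overload probability, the sketch below uses a simple binomial model. The model is an assumption introduced here, not something stated in the text: each source is taken to be active independently with the same probability, and overload means that more sources are active than there are channels.

```python
from math import comb


def overload_probability(sources, channels, p_active):
    """P(more than `channels` of `sources` are active at once).

    Assumes independent sources, each active with probability p_active
    (a simplifying assumption for illustration only).
    """
    return sum(comb(sources, k) * p_active**k * (1 - p_active)**(sources - k)
               for k in range(channels + 1, sources + 1))


if __name__ == "__main__":
    # Hypothetical numbers: 48 sources, each active 40% of the time.
    for channels in (20, 24, 28, 32):
        print(channels, f"{overload_probability(48, channels, 0.4):.2e}")
```

Providing more channels than the average demand lowers the overload probability quickly, which is why fewer channels than sources can still give acceptable service.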

In TDMA systems employed for telephony, the design parameters are chosen so that any 
overload condition lasts only a fraction of a second, which leads to acceptable performance 
for speech communication. For other types of data and telegraphy, transmission delays are 
unimportant. Hence, in overload condition, the incoming data can be stored and transmitted 
later. 


6.4.2 Asynchronous Channels and Bit Stuffing 

In the preceding discussion, we assumed synchronization between all the incoming channels
and the multiplexer. This is difficult even when all the channels are nominally at the same
rate. For example, consider a 1000 km coaxial cable carrying $2 \times 10^8$ pulses per second.
Assuming the nominal propagation speed in the cable to be $2 \times 10^8$ m/s, it takes 1/200 second
of transit time and 1 million pulses will be in transit. If the cable temperature increases by 1°F,
the propagation velocity will increase by about 0.01%. This will cause the pulses in transit
to arrive sooner, thus producing a temporary increase in the rate of pulses received. Because
the extra pulses cannot be accommodated in the multiplexer, they must be temporarily stored
at the receiver. If the cable temperature drops, the rate of received pulses will drop, and the
multiplexer will have vacant slots with no data. These slots need to be stuffed with dummy
digits (pulse stuffing).
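
The figures in the cable example can be checked numerically, including an estimate of how many pulses arrive early when the velocity rises by 0.01%. The estimate (roughly 100 pulses) is derived here from the numbers in the text and is not quoted from it; the function names are illustrative.

```python
def transit_time_s(length_m, velocity_m_per_s):
    """Time for a pulse to traverse the cable."""
    return length_m / velocity_m_per_s


def pulses_in_transit(length_m, velocity_m_per_s, pulse_rate_hz):
    """Number of pulses simultaneously on the cable."""
    return pulse_rate_hz * transit_time_s(length_m, velocity_m_per_s)


def early_pulses(length_m, velocity_m_per_s, pulse_rate_hz, fractional_speedup):
    """Drop in the number of pulses in transit after the velocity increases;
    these pulses reach the receiver ahead of the nominal schedule."""
    faster = velocity_m_per_s * (1 + fractional_speedup)
    return (pulses_in_transit(length_m, velocity_m_per_s, pulse_rate_hz)
            - pulses_in_transit(length_m, faster, pulse_rate_hz))


if __name__ == "__main__":
    length, velocity, rate = 1.0e6, 2.0e8, 2.0e8   # 1000 km, 2e8 m/s, 2e8 pulses/s
    print(transit_time_s(length, velocity))
    print(pulses_in_transit(length, velocity, rate))
    print(early_pulses(length, velocity, rate, 1e-4))
```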

DS1 signals in the North American network are often generated by crystal oscillators 
in individual channel banks or other digital terminal equipment. Although the oscillators are 
quite stable, they will not oscillate at exactly the same frequency, leading to another cause of 
asynchronicity in the network. 

This shows that even in synchronously multiplexed systems, the data are rarely received
at a synchronous rate. We always need storage (known as an elastic store) and pulse stuffing
(also known as justification) to accommodate such a situation. Obviously, this method of an
elastic store and pulse stuffing will work even when the channels are asynchronous.

Three variants of the pulse stuffing scheme exist: (1) positive pulse stuffing, (2) negative
pulse stuffing, and (3) positive/negative pulse stuffing. In positive pulse stuffing, the multiplexer
rate is higher than that required to accommodate all incoming tributaries at their maximum
rate. Hence, the time slots in the multiplexed signal will become available at a rate exceeding
that of the incoming data so that the tributary data will tend to lag (Fig. 6.24). At some stage,
the system will decide that this lag has become great enough to require pulse stuffing. The
information about the stuffed-pulse positions is transmitted through overhead bits. From the
overhead bits, the receiver knows the stuffed-pulse position and eliminates that pulse.

Figure 6.24

Negative pulse stuffing is a complement of positive pulse stuffing. The time slots in the 
multiplexed signal now appear at a slightly slower rate than those of the tributaries, and thus 
the multiplexed signal cannot accommodate all the tributary pulses. Information about any 
left-out pulse and its position is transmitted through overhead bits. The positive/negative pulse 
stuffing is a combination of the first two schemes. The nominal rate of the multiplexer is equal 
to the nominal rate required to accommodate all incoming channels. Hence, we may need 
positive pulse stuffing at some times and negative stuffing at others. All this information is 
sent through overhead bits. 

The C digits in Fig. 6.23 are used to transmit stuffing information. Only one stuffed bit 
per input channel is allowed per frame. This is sufficient to accommodate expected variations 
in the input signal rate. The bits Ca convey information about stuffing in channel A, bits Cb 
convey information about stuffing in channel B, and so on. The insertion of any stuffed pulse in 
any one subframe is denoted by setting all the three Cs in that line to 1. No stuffing is indicated 
by using 0s for all the three Cs. If a bit has been stuffed, the stuffed bit is the first information 
bit associated with the immediate channel following the F1 bit, that is, the first such bit in the 
last 48-bit sequence in that subframe. For the first subframe, the stuffed bit will immediately 
follow the F1 bit. For the second subframe, the stuffed bit will be the second bit following the 
F1 bit, and so on. 


6.4.3 Plesiochronous (Almost Synchronous) Digital Hierarchy 

We now present the digital hierarchy developed by the Bell System and currently included 
in the ANSI standards for telecommunications (Fig. 6.25). The North American hierarchy is 
implemented in North America and Japan. 

Two major classes of multiplexers are used in practice. The first category is used for 
combining low-data-rate channels. It multiplexes channels of rates of up to 9600 bit/s into a 
signal of data rate of up to 64 kbit/s. The multiplexed signal, called “digital signal level 0” 
(DS0) in the North American hierarchy, is eventually transmitted over a voice-grade channel. 
The second class of multiplexers is at a much higher bit rate. 
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Figure 6.25 
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There are four orders, or levels, of multiplexing. The first level is the T1 multiplexer 
or channel bank, consisting of 24 channels of 64 kbit/s each. The output of this multiplexer 
is a DS1 (digital level 1) signal at a rate of 1.544 Mbit/s. Four DS1 signals are multiplexed 
by a DM1/2 multiplexer to yield a DS2 signal at a rate of 6.312 Mbit/s. Seven DS2 signals are 
multiplexed by a DM2/3 multiplexer to yield a DS3 signal at a rate of 44.736 Mbit/s. Finally, 
three DS3 signals are multiplexed by a DM3/4NA multiplexer to yield a DS4NA signal at a 
rate of 139.264 Mbit/s. There is also a lower rate multiplexing hierarchy, known as the digital 
data system (DDS), which provides standards for multiplexing digital signals with rates as 
low as 2.4 kbit/s into a DS0 signal for transmission through the network. 
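
The rates quoted above can be checked with a little arithmetic. The following is a minimal 
sketch in Python (the table layout and the function name are illustrative, not part of any 
standard) showing that each multiplexer output rate exceeds the sum of its nominal input 
rates. The excess is the capacity available for overhead such as framing and stuffing control. 

```python
# Rates in Mbit/s, taken from the paragraph above.
# Each entry: (output signal, output rate, number of inputs, rate of each input)
LEVELS = [
    ("DS1", 1.544, 24, 0.064),
    ("DS2", 6.312, 4, 1.544),
    ("DS3", 44.736, 7, 6.312),
    ("DS4NA", 139.264, 3, 44.736),
]


def overhead_mbps(out_rate, n_inputs, in_rate):
    """Output rate minus the sum of the nominal input rates."""
    return out_rate - n_inputs * in_rate


for name, out_rate, n_inputs, in_rate in LEVELS:
    extra = overhead_mbps(out_rate, n_inputs, in_rate)
    print(f"{name}: inputs {n_inputs * in_rate:.3f} Mbit/s, "
          f"overhead {extra:.3f} Mbit/s")
```

For the T1 level, for instance, the 24 channels account for 1.536 Mbit/s, leaving 8 kbit/s 
of the 1.544 Mbit/s DS1 rate for overhead. 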

The inputs to a T1 multiplexer need not be restricted to digitized voice channels. 
Any digital signal of 64 kbit/s of appropriate format can be transmitted. The case of 
the higher levels is similar. For example, all the incoming channels of the DM1/2 multiplexer 
need not be DS1 signals obtained by multiplexing 24 channels of 64 kbit/s each. Some of them 
may be 1.544 Mbit/s digital signals of appropriate format, and so on. 

In Europe and many other parts of the world, another hierarchy, recommended by the 
ITU as a standard, has been adopted. This hierarchy, based on multiplexing 30 telephone 
channels of 64 kbit/s (E-0 channels) into an E-1 carrier at 2.048 Mbit/s (30 channels), is shown 
in Fig. 6.26. Starting from the base level of E-1, four lower level lines form one higher level line 
progressively, generating an E-2 line with data throughput of 8.448 Mbit/s, an E-3 line with 
data throughput of 34.368 Mbit/s, an E-4 line with data throughput of 139.264 Mbit/s, and 
an E-5 line with data throughput of 565.148 Mbit/s. Because different networks must be able 
to interface with one another across the three different systems (North American, Japanese, 
and other) in the world, Fig. 6.26 shows how the hierarchies relate and the points of their 
common interface. 
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Figure 6.26 
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6.5 DIFFERENTIAL PULSE CODE 
MODULATION (DPCM) 

PCM is not a very efficient system because it generates so many bits and requires so much 
bandwidth to transmit. Many different ideas have been proposed to improve the encoding 
efficiency of A/D conversion. In general, these ideas exploit the characteristics of the source 
signals. DPCM is one such scheme. 

In analog messages we can make a good guess about a sample value from knowledge of 
past sample values. In other words, the sample values are not independent, and generally there 
is a great deal of redundancy in the Nyquist samples. Proper exploitation of this redundancy 
leads to encoding a signal with fewer bits. Consider a simple scheme: instead of transmitting the 
sample values, we transmit the difference between successive sample values. Thus, if $m[k]$ is 
the $k$th sample, instead of transmitting $m[k]$, we transmit the difference $d[k] = m[k] - m[k-1]$. 
At the receiver, knowing $d[k]$ and the previous sample value $m[k-1]$, we can reconstruct 
$m[k]$. Thus, from knowledge of the difference $d[k]$, we can reconstruct $m[k]$ iteratively at the 
receiver. Now, the difference between successive samples is generally much smaller than the 
sample values. Thus, the peak amplitude $m_p$ of the transmitted values is reduced considerably. 
Because the quantization interval $\Delta v = m_p/L$, for a given $L$ (or $n$), this reduces the quantization 
interval $\Delta v$, thus reducing the quantization noise, which is given by $(\Delta v)^2/12$. This means that 
for a given $n$ (or transmission bandwidth), we can increase the SNR, or for a given SNR, we 
can reduce $n$ (or transmission bandwidth). 

We can improve upon this scheme by estimating (predicting) the value of the $k$th sample 
$m[k]$ from a knowledge of several previous sample values. If this estimate is $\hat{m}[k]$, then we 
transmit the difference (prediction error) $d[k] = m[k] - \hat{m}[k]$. At the receiver also, we determine 
the estimate $\hat{m}[k]$ from the previous sample values, and then generate $m[k]$ by adding the 
received $d[k]$ to the estimate $\hat{m}[k]$. Thus, we reconstruct the samples at the receiver iteratively. 
If our prediction is worth its salt, the predicted (estimated) value $\hat{m}[k]$ will be close to $m[k]$, 
and their difference (prediction error) $d[k]$ will be even smaller than the difference between 
the successive samples. Consequently, this scheme, known as the differential PCM (DPCM), 
is superior to the naive prediction described in the preceding paragraph, which is a special case 
of DPCM, where the estimate of a sample value is taken as the previous sample value, that is, 
$\hat{m}[k] = m[k-1]$. 

Spirits of Taylor, Maclaurin, and Wiener 

Before describing DPCM, we shall briefly discuss the approach to signal prediction (estimation). 
To the uninitiated, future prediction seems like mysterious stuff, fit only for psychics, 
wizards, mediums, and the like, who can summon help from the spirit world. Electrical 
engineers appear to be hopelessly outclassed in this pursuit. Not quite so! We can also summon 
the spirits of Taylor, Maclaurin, Wiener, and the like to help us. What is more, unlike 
Shakespeare's spirits, our spirits come when called.* Consider, for example, a signal $m(t)$, 
which has derivatives of all orders at $t$. Using the Taylor series for this signal, we can express 
$m(t + T_s)$ as 


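$$m(t + T_s) = m(t) + T_s\,\dot{m}(t) + \frac{T_s^2}{2!}\,\ddot{m}(t) + \frac{T_s^3}{3!}\,\dddot{m}(t) + \cdots \tag{6.42a}$$

$$\approx m(t) + T_s\,\dot{m}(t) \quad \text{for small } T_s \tag{6.42b}$$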

Equation (6.42a) shows that from a knowledge of the signal and its derivatives at instant $t$, we 
can predict a future signal value at $t + T_s$. In fact, even if we know just the first derivative, 
we can still predict this value approximately, as shown in Eq. (6.42b). Let us denote the $k$th 
sample of $m(t)$ by $m[k]$, that is, $m(kT_s) = m[k]$, and $m(kT_s \pm T_s) = m[k \pm 1]$, and so on. 
Setting $t = kT_s$ in Eq. (6.42b), and recognizing that $\dot{m}(kT_s) \approx [m(kT_s) - m(kT_s - T_s)]/T_s$, we 
obtain 

$$m[k+1] \approx m[k] + T_s\left[\frac{m[k] - m[k-1]}{T_s}\right] = 2m[k] - m[k-1]$$


This shows that we can find a crude prediction of the $(k+1)$th sample from the two previous 
samples. The approximation in Eq. (6.42b) improves as we add more terms in the series on 
the right-hand side. To determine the higher order derivatives in the series, we require more 
samples in the past. The larger the number of past samples we use, the better will be the 
prediction. Thus, in general, we can express the prediction formula as 

$$m[k] \approx a_1 m[k-1] + a_2 m[k-2] + \cdots + a_N m[k-N] \tag{6.43}$$

The right-hand side is $\hat{m}[k]$, the predicted value of $m[k]$. Thus, 

$$\hat{m}[k] = a_1 m[k-1] + a_2 m[k-2] + \cdots + a_N m[k-N] \tag{6.44}$$

This is the equation of an $N$th-order predictor. Larger $N$ would result in better prediction in 
general. The output of this filter (predictor) is $\hat{m}[k]$, the predicted value of $m[k]$. The input 
consists of the previous samples $m[k-1], m[k-2], \ldots, m[k-N]$, although it is customary 
to say that the input is $m[k]$ and the output is $\hat{m}[k]$. Observe that this equation reduces to 
$\hat{m}[k] = m[k-1]$ in the case of the first-order prediction. It follows from Eq. (6.42b), where 
we retain only the first term on the right-hand side. This means that $a_1 = 1$, and the first-order 
predictor is a simple time delay. 

* From Shakespeare, Henry IV, Part 1, Act III, Scene 1: 
Glendower: I can call the spirits from vasty deep. 
Hotspur: Why, so can I, or so can any man; 
But will they come when you do call for them? 

Figure 6.27 

We have outlined here a very simple procedure for predictor design. In a more sophisticated 
approach, discussed in Sec. 8.5, where we use the minimum mean squared error criterion for 
best prediction, the prediction coefficients $a_j$ in Eq. (6.44) are determined from the statistical 
correlation between various samples. The predictor described in Eq. (6.44) is called a linear 
predictor. It is basically a transversal filter (a tapped delay line), where the tap gains are set 
equal to the prediction coefficients, as shown in Fig. 6.27. 
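
To make the tapped-delay-line view concrete, here is a minimal sketch in Python of the 
predictor of Eq. (6.44). The function names and the sinusoidal test signal are assumptions 
chosen only for illustration. It compares the first-order predictor ($a_1 = 1$) with the 
second-order predictor suggested by the Taylor argument ($a_1 = 2$, $a_2 = -1$). 

```python
import math


def predict(past, coeffs):
    """N-th order linear prediction, Eq. (6.44).

    past[0] is m[k-1], past[1] is m[k-2], and so on; coeffs are a_1..a_N.
    """
    return sum(a * m for a, m in zip(coeffs, past))


def max_prediction_error(samples, coeffs):
    """Largest |m[k] - m_hat[k]| over the samples that have N past values."""
    n = len(coeffs)
    errors = []
    for k in range(n, len(samples)):
        past = [samples[k - j] for j in range(1, n + 1)]
        errors.append(abs(samples[k] - predict(past, coeffs)))
    return max(errors)


# Hypothetical test signal: a unit sinusoid sampled 50 times per period.
samples = [math.sin(2 * math.pi * k / 50) for k in range(200)]

first_order = max_prediction_error(samples, [1.0])         # m_hat[k] = m[k-1]
second_order = max_prediction_error(samples, [2.0, -1.0])  # m_hat[k] = 2m[k-1] - m[k-2]
print(first_order, second_order)
```

For this smooth, heavily oversampled signal the second-order prediction error comes out 
several times smaller than the first-order error, in line with the claim that using more 
past samples generally improves the prediction. 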

Analysis of DPCM 

As mentioned earlier, in DPCM we transmit not the present sample $m[k]$ but $d[k]$ (the 
difference between $m[k]$ and its predicted value $\hat{m}[k]$). At the receiver, we generate $\hat{m}[k]$ 
from the past sample values, to which the received $d[k]$ is added to generate $m[k]$. 
There is, however, one difficulty associated with this scheme. At the receiver, instead of 
the past samples $m[k-1], m[k-2], \ldots$, as well as $d[k]$, we have their quantized versions 
$m_q[k-1], m_q[k-2], \ldots$. Hence, we cannot determine $\hat{m}[k]$. We can determine only 
$\hat{m}_q[k]$, the estimate of the quantized sample $m_q[k]$, in terms of the quantized samples 
$m_q[k-1], m_q[k-2], \ldots$. This will increase the error in reconstruction. In such a case, a better 
strategy is to determine $\hat{m}_q[k]$, the estimate of $m_q[k]$ (instead of $m[k]$), at the transmitter also 
from the quantized samples $m_q[k-1], m_q[k-2], \ldots$. The difference $d[k] = m[k] - \hat{m}_q[k]$ is 
now transmitted via PCM. At the receiver, we can generate $\hat{m}_q[k]$, and from the received $d[k]$, 
we can reconstruct $m_q[k]$. 

Figure 6.28a shows a DPCM transmitter. We shall soon show that the predictor input is 
$m_q[k]$. Naturally, its output is $\hat{m}_q[k]$, the predicted value of $m_q[k]$. The difference 

$$d[k] = m[k] - \hat{m}_q[k] \tag{6.45}$$

is quantized to yield 

$$d_q[k] = d[k] + q[k] \tag{6.46}$$

where $q[k]$ is the quantization error. The predictor output $\hat{m}_q[k]$ is fed back to its input so that 
the predictor input $m_q[k]$ is 

$$m_q[k] = \hat{m}_q[k] + d_q[k] = m[k] - d[k] + d_q[k] = m[k] + q[k] \tag{6.47}$$
Figure 6.28 DPCM system: (a) transmitter; (b) receiver. 
This shows that $m_q[k]$ is a quantized version of $m[k]$. The predictor input is indeed $m_q[k]$, as 
assumed. The quantized signal $d_q[k]$ is now transmitted over the channel. The receiver shown 
in Fig. 6.28b is identical to the shaded portion of the transmitter. The inputs in both cases are 
also the same, namely, $d_q[k]$. Therefore, the predictor output must be $\hat{m}_q[k]$ (the same as the 
predictor output at the transmitter). Hence, the receiver output (which is the predictor input) is 
also the same, viz., $m_q[k] = m[k] + q[k]$, as found in Eq. (6.47). This shows that we are able to 
receive the desired signal $m[k]$ plus the quantization noise $q[k]$. This is the quantization noise 
associated with the difference signal $d[k]$, which is generally much smaller than $m[k]$. The 
received samples $m_q[k]$ are decoded and passed through a low-pass filter for D/A conversion. 
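
The feedback structure of Eqs. (6.45) to (6.47) can be sketched in a few lines of Python. 
This is only an illustration under stated assumptions: the predictor is first order 
($\hat{m}_q[k] = m_q[k-1]$), the quantizer is uniform with no limit on the number of levels, 
and the function names are hypothetical. 

```python
import math


def quantize(x, step):
    """Uniform quantizer without clipping (an illustrative simplification)."""
    return step * round(x / step)


def dpcm_encode(samples, step):
    """Return the quantized differences d_q[k] for a first-order predictor."""
    mq_prev = 0.0                # m_q[k-1]; zero initial condition assumed
    dq_list = []
    for m in samples:
        mq_hat = mq_prev         # predictor output, the estimate of m_q[k]
        d = m - mq_hat           # Eq. (6.45)
        dq = quantize(d, step)   # Eq. (6.46): d_q[k] = d[k] + q[k]
        mq_prev = mq_hat + dq    # Eq. (6.47): predictor input m_q[k]
        dq_list.append(dq)
    return dq_list


def dpcm_decode(dq_list):
    """Receiver: the same predictor loop, driven by d_q[k]."""
    mq_prev = 0.0
    out = []
    for dq in dq_list:
        mq = mq_prev + dq        # m_q[k] = m_hat_q[k] + d_q[k]
        out.append(mq)
        mq_prev = mq
    return out


signal = [math.sin(2 * math.pi * k / 40) for k in range(120)]
step = 0.05
received = dpcm_decode(dpcm_encode(signal, step))
worst = max(abs(m - r) for m, r in zip(signal, received))
print(worst <= step / 2 + 1e-9)
```

Because the transmitter predicts from $m_q[k-1]$ rather than from $m[k-1]$, the quantization 
error does not accumulate: each reconstructed sample differs from $m[k]$ only by the single 
error $q[k]$, which this quantizer keeps within half a step. 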


SNR Improvement 

To determine the improvement in DPCM over PCM, let $m_p$ and $d_p$ be the peak amplitudes 
of $m(t)$ and $d(t)$, respectively. If we use the same value of $L$ in both cases, the quantization 
step $\Delta v$ in DPCM is reduced by the factor $d_p/m_p$. Because the quantization noise power is 
$(\Delta v)^2/12$, the quantization noise in DPCM is reduced by the factor $(m_p/d_p)^2$, and the SNR 
is increased by the same factor. Moreover, the signal power is proportional to its peak value 
squared (assuming other statistical properties invariant). Therefore, $G_p$ (SNR improvement 
due to prediction) is at least 

$$G_p = \frac{P_m}{P_d}$$

where $P_m$ and $P_d$ are the powers of $m(t)$ and $d(t)$, respectively. In terms of decibel units, this 
means that the SNR increases by $10\log_{10}(P_m/P_d)$ dB. Therefore, Eq. (6.41) applies to DPCM 
also with a value of $\alpha$ that is higher by $10\log_{10}(P_m/P_d)$ dB. In Example 8.24, a second-order 
predictor processor for speech signals is analyzed. For this case, the SNR improvement is 
found to be 5.6 dB. In practice, the SNR improvement may be as high as 25 dB in such cases 
as short-term voiced speech spectra and in the spectra of low-activity images.¹² Alternately, 
for the same SNR, the bit rate for DPCM could be lower than that for PCM by 3 to 4 bits per 
sample. Thus, telephone systems using DPCM can often operate at 32 or even 24 kbit/s. 
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6.6 ADAPTIVE DIFFERENTIAL PCM (ADPCM) 

Adaptive DPCM (ADPCM) can further improve the efficiency of DPCM encoding by incorporating 
an adaptive quantizer at the encoder. Figure 6.29 illustrates the basic configuration 
of ADPCM. For practical reasons, the number of quantization levels $L$ is fixed. When a fixed 
quantization step $\Delta v$ is applied, either the quantization error is too large because $\Delta v$ is too 
big or the quantizer cannot cover the necessary signal range when $\Delta v$ is too small. Therefore, 
it would be better for the quantization step $\Delta v$ to be adaptive so that $\Delta v$ is large or small 
depending on whether the prediction error for quantizing is large or small. 

It is important to note that the quantized prediction error $d_q[k]$ can be a good indicator of 
the prediction error size. For example, when the quantized prediction error samples vary close 
to the largest positive value (or the largest negative value), it indicates that the prediction error 
is large and $\Delta v$ needs to grow. Conversely, if the quantized samples oscillate near zero, then 
the prediction error is small and $\Delta v$ needs to decrease. It is important that both the modulator 
and the receiver have access to the same quantized samples. Hence, the adaptive quantizer and 
the receiver reconstruction can apply the same algorithm to adjust $\Delta v$ identically. 
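
The idea that both ends can adapt $\Delta v$ from the transmitted bits alone can be sketched as 
follows. This is a minimal illustration in Python, not the algorithm of any standard: the 
4-level quantizer, the first-order predictor, the growth and decay factors (1.5 and 0.8), 
and the step limits are all assumptions made for the example. 

```python
import math


def adapt(step, code, lo=1e-3, hi=1.0):
    """Update the step from the transmitted 2-bit code alone (hypothetical rule).

    Codes 0 and 3 are the outer levels (grow), 1 and 2 the inner levels (shrink).
    """
    factor = 1.5 if code in (0, 3) else 0.8
    return min(hi, max(lo, step * factor))


def level(code, step):
    """Reconstruction level of a 4-level quantizer: -1.5, -0.5, +0.5, +1.5 steps."""
    return (code - 1.5) * step


def encode(samples, step0=0.05):
    step, mq, codes = step0, 0.0, []
    for m in samples:
        d = m - mq                                 # prediction error
        code = min(3, max(0, int(d // step) + 2))  # clip to the 4 available levels
        mq += level(code, step)                    # same reconstruction as the decoder
        step = adapt(step, code)                   # adapt after using the current step
        codes.append(code)
    return codes


def decode(codes, step0=0.05):
    step, mq, out = step0, 0.0, []
    for code in codes:
        mq += level(code, step)
        out.append(mq)
        step = adapt(step, code)
    return out


signal = [math.sin(2 * math.pi * k / 100) for k in range(300)]
codes = encode(signal)
reconstructed = decode(codes)
```

The decoder never sees the step size itself. It recomputes the same sequence of steps 
because `adapt` depends only on the codes, which both ends share. 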

Compared with DPCM, ADPCM can further compress the number of bits needed for a 
signal waveform. For example, it is very common in practice for an 8-bit PCM sequence to be 
encoded into a 4-bit ADPCM sequence at the same sampling rate. This easily represents a 2:1 
bandwidth or storage reduction with virtually no loss. 

The ADPCM encoder has many practical applications. The ITU-T standard G.726 specifies an 
ADPCM speech coder and decoder (called codec) for speech signal samples at 8 kHz.⁷ The 
G.726 ADPCM codec uses an eighth-order predictor. For different quality levels, G.726 
specifies four different ADPCM rates at 16, 24, 32, and 40 kbit/s. They correspond to four 
different bit sizes for each speech sample at 2 bits, 3 bits, 4 bits, and 5 bits, respectively, or 
equivalently, quantization levels of 4, 8, 16, and 32, respectively. 
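
The correspondence between bits per sample, bit rate, and number of levels follows directly 
from the 8 kHz sampling rate. A short Python check (the helper names are illustrative): 

```python
SAMPLING_RATE_KHZ = 8  # speech sampling rate stated in the text


def adpcm_rate_kbps(bits_per_sample):
    """Bit rate in kbit/s for a given number of bits per sample."""
    return bits_per_sample * SAMPLING_RATE_KHZ


def quantization_levels(bits_per_sample):
    """Number of quantization levels L = 2^n."""
    return 2 ** bits_per_sample


for bits in (2, 3, 4, 5):
    print(f"{bits} bits/sample: {adpcm_rate_kbps(bits)} kbit/s, "
          f"L = {quantization_levels(bits)}")
```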

The most common ADPCM speech encoders use 32 kbit/s. In practice, there are multiple 
variations of ADPCM speech codec. In addition to the ITU-T G.726 specification,⁷ these 
include the OKI ADPCM codec, the Microsoft ADPCM codec supported by WAVE players, 
and the Interactive Multimedia Association (IMA) ADPCM, also known as the DVI ADPCM. 
The 32 kbit/s ITU-T G.726 ADPCM speech codec is widely used in the DECT (digital enhanced 
cordless telecommunications) system, which itself is widely used for residential and business 
cordless phone communications. Designed for short-range use as an access mechanism to the 
main networks, DECT offers cordless voice, fax, data, and multimedia communications. DECT 
is now in use in over 100 countries worldwide. Another major user of the 32 kbit/s ADPCM 
codec is the Personal Handy-phone System (or PHS), also marketed as the Personal Access 
System (PAS) and known as Xiaolingtong in China. 

PHS is a mobile network system similar to a cellular network, operating in the 1880 to 1930 
MHz frequency band, used mainly in Japan, China, Taiwan, and elsewhere in Asia. Originally 
developed by the NTT Laboratory in Japan in 1989, PHS is much simpler to implement and 
deploy. Unlike cellular networks, PHS phones and base stations are low-power, short-range 
facilities. The service is often pejoratively called the “poor man's cellular” because of its 
limited range and poor roaming ability. PHS first saw limited deployment (NTT-Personal, 
DDI-Pocket, and ASTEL) in Japan in 1995 but has since nearly disappeared. Surprisingly, 
PHS has seen a resurgence in markets like China, Taiwan, Vietnam, Bangladesh, Nigeria, 
Mali, Tanzania, and Honduras, where its low cost of deployment and hardware costs offset 
the system's disadvantages. In China alone, there was an explosive expansion of subscribers, 
reaching nearly 80 million in 2006. 

Figure 6.29 ADPCM encoder uses an adaptive quantizer controlled only by the encoder 
output bits. 


6.7 DELTA MODULATION 

Sample correlation used in DPCM is further exploited in delta modulation (DM) by oversampling 
(typically four times the Nyquist rate) the baseband signal. This increases the correlation 
between adjacent samples, which results in a small prediction error that can be encoded using 
only one bit ($L = 2$). Thus, DM is basically a 1-bit DPCM, that is, a DPCM that uses only two 
levels ($L = 2$) for quantization of $m[k] - \hat{m}_q[k]$. In comparison to PCM (and DPCM), it is a 
very simple and inexpensive method of A/D conversion. A 1-bit codeword in DM makes word 
framing unnecessary at the transmitter and the receiver. This strategy allows us to use fewer 
bits per sample for encoding a baseband signal. 

In DM, we use a first-order predictor, which, as seen earlier, is just a time delay of $T_s$ 
(the sampling interval). Thus, the DM transmitter (modulator) and receiver (demodulator) are 
identical to those of the DPCM in Fig. 6.28, with a time delay for the predictor, as shown in 
Fig. 6.30, from which we can write 

$$m_q[k] = m_q[k-1] + d_q[k] \tag{6.48}$$

Hence, 

$$m_q[k-1] = m_q[k-2] + d_q[k-1]$$

Figure 6.30 Delta modulation is a special case of DPCM [panels (a) and (b)]. 
Substituting this equation into Eq. (6.48) yields 

$$m_q[k] = m_q[k-2] + d_q[k] + d_q[k-1]$$

Proceeding iteratively in this manner, and assuming zero initial condition, that is, $m_q[0] = 0$, 
we write 

$$m_q[k] = \sum_{m=0}^{k} d_q[m] \tag{6.49}$$

This shows that the receiver (demodulator) is just an accumulator (adder). If the output 
$d_q[k]$ is represented by impulses, then the accumulator (receiver) may be realized by an integrator 
because its output is the sum of the strengths of the input impulses (sum of the areas under 
the impulses). We may also replace with an integrator the feedback portion of the modulator 
(which is identical to the demodulator). The demodulator output is $m_q[k]$, which when passed 
through a low-pass filter yields the desired signal reconstructed from the quantized samples. 

Figure 6.31 shows a practical implementation of the delta modulator and demodulator. 
As discussed earlier, the first-order predictor is replaced by a low-cost integrator circuit (such 
as an RC integrator). The modulator (Fig. 6.31a) consists of a comparator and a sampler in 
the direct path and an integrator-amplifier in the feedback path. Let us see how this delta 
modulator works. 

The analog signal $m(t)$ is compared with the feedback signal (which serves as a predicted 
signal) $\hat{m}_q(t)$. The error signal $d(t) = m(t) - \hat{m}_q(t)$ is applied to a comparator. If $d(t)$ is positive, 
the comparator output is a constant signal of amplitude $E$, and if $d(t)$ is negative, the comparator 
output is $-E$. Thus, the difference is a binary signal ($L = 2$) that is needed to generate a 1-bit 
DPCM. The comparator output is sampled by a sampler at a rate of $f_s$ samples per second, 
where $f_s$ is typically much higher than the Nyquist rate. The sampler thus produces a train 
of narrow pulses $d_q[k]$ (to simulate impulses) with a positive pulse when $m(t) > \hat{m}_q(t)$ and a 
negative pulse when $m(t) < \hat{m}_q(t)$. Note that each sample is coded by a single binary pulse 
(1-bit DPCM), as required. The pulse train $d_q[k]$ is the delta-modulated pulse train (Fig. 6.31d). 
The modulated signal $d_q[k]$ is amplified and integrated in the feedback path to generate $\hat{m}_q(t)$ 
(Fig. 6.31e), which tries to follow $m(t)$. 

To understand how this works, we note that each pulse in $d_q[k]$ at the input of the integrator 
gives rise to a step function (positive or negative, depending on the pulse polarity) in $\hat{m}_q(t)$. If, 
for example, $m(t) > \hat{m}_q(t)$, a positive pulse is generated in $d_q[k]$, which gives rise to a positive 
step in $\hat{m}_q(t)$, trying to equalize $\hat{m}_q(t)$ to $m(t)$ in small steps at every sampling instant, as shown 
in Fig. 6.31c. It can be seen that $\hat{m}_q(t)$ is a kind of staircase approximation of $m(t)$. When $\hat{m}_q(t)$ 
is passed through a low-pass filter, the coarseness of the staircase in $\hat{m}_q(t)$ is eliminated, and 
we get a smoother and better approximation to $m(t)$. The demodulator at the receiver consists 
of an amplifier-integrator (identical to that in the feedback path of the modulator) followed by 
a low-pass filter (Fig. 6.31b). 
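
The comparator, sampler, and integrator loop described above can be imitated in discrete time. 
The following Python sketch is an illustration under assumptions (an ideal accumulator in 
place of the RC integrator, zero initial condition, hypothetical function names). It omits 
the final low-pass filter and simply returns the staircase. 

```python
import math


def dm_encode(samples, step):
    """1-bit DPCM: emit +1 when the message exceeds the staircase, else -1."""
    staircase, bits = 0.0, []
    for m in samples:
        bit = 1 if m > staircase else -1   # polarity of the comparator output
        staircase += bit * step            # integrator in the feedback path
        bits.append(bit)
    return bits


def dm_decode(bits, step):
    """Receiver: accumulate the pulses, as in Eq. (6.49)."""
    staircase, out = 0.0, []
    for bit in bits:
        staircase += bit * step
        out.append(staircase)
    return out


# Hypothetical message: unit sinusoid, 200 samples per period, so the largest
# change between samples (about 0.031) stays below the step size of 0.05.
message = [math.sin(2 * math.pi * k / 200) for k in range(400)]
step = 0.05
staircase = dm_decode(dm_encode(message, step), step)
worst = max(abs(m - s) for m, s in zip(message, staircase))
print(worst <= step + 1e-9)
```

As long as the message changes by less than one step per sampling interval, the staircase 
stays within one step of the message. 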

DM Transmits the Derivative of m(t) 

In PCM, the analog signal samples are quantized in $L$ levels, and this information is transmitted 
by $n$ pulses per sample ($n = \log_2 L$). A little reflection shows that in DM, the modulated signal 
carries information not about the signal samples but about the difference between successive 
samples. If the difference is positive or negative, a positive or a negative pulse (respectively) 
is generated in the modulated signal $d_q[k]$. Basically, therefore, DM carries the information 
about the derivative of $m(t)$; hence, the name “delta modulation.” This can also be seen from 
the fact that integration of the delta-modulated signal yields $\hat{m}_q(t)$, which is an approximation 
of $m(t)$. 

Figure 6.31 

In PCM, the information of each quantized sample is transmitted by an n-bit code word, 
whereas in DM the information of the difference between successive samples is transmitted 
by a 1-bit code word. 

Threshold of Coding and Overloading 

Threshold and overloading effects can be clearly seen in Fig. 6.31c. Variations in $m(t)$ smaller 
than the step value (threshold of coding) are lost in DM. Moreover, if $m(t)$ changes too fast, 
that is, if $\dot{m}(t)$ is too high, $\hat{m}_q(t)$ cannot follow $m(t)$, and overloading occurs. This is the 
so-called slope overload, which gives rise to the slope overload noise. This noise is one 
of the basic limiting factors in the performance of DM. We should expect slope overload 
rather than amplitude overload in DM, because DM basically carries the information about 
$\dot{m}(t)$. The granular nature of the output signal gives rise to the granular noise similar to the 
quantization noise. The slope overload noise can be reduced by increasing $E$ (the step size). 
This unfortunately increases the granular noise. There is an optimum value of $E$, which yields 
the best compromise giving the minimum overall noise. This optimum value of $E$ depends on 
the sampling frequency $f_s$ and the nature of the signal.¹² 

The slope overload occurs when $\hat{m}_q(t)$ cannot follow $m(t)$. During the sampling interval 
$T_s$, $\hat{m}_q(t)$ is capable of changing by $E$, where $E$ is the height of the step. Hence, the maximum 
slope that $\hat{m}_q(t)$ can follow is $E/T_s$, or $Ef_s$, where $f_s$ is the sampling frequency. Hence, no 
overload occurs if 

$$|\dot{m}(t)| < E f_s$$

Consider the case of tone modulation (meaning a sinusoidal message): 

$$m(t) = A\cos\omega t$$

The condition for no overload is 

$$|\dot{m}(t)|_{\max} = \omega A < E f_s \tag{6.50}$$

Hence, the maximum amplitude $A_{\max}$ of this signal that can be tolerated without overload is 
given by 

$$A_{\max} = \frac{E f_s}{\omega} \tag{6.51}$$

The overload amplitude of the modulating signal is inversely proportional to the frequency 
$\omega$. For higher modulating frequencies, the overload occurs for smaller amplitudes. For voice 
signals, which contain all frequency components up to (say) 4 kHz, calculating $A_{\max}$ by using 
$\omega = 2\pi \times 4000$ in Eq. (6.51) will give an overly conservative value. It has been shown by de 
Jager¹³ that $A_{\max}$ for voice signals can be calculated by using $\omega_r \approx 2\pi \times 800$ in Eq. (6.51), 

$$[A_{\max}]_{\text{voice}} \approx \frac{E f_s}{\omega_r} \tag{6.52}$$

Thus, the maximum voice signal amplitude that can be used without causing slope 
overload in DM is the same as the maximum amplitude of a sinusoidal signal of reference 
frequency $f_r$ ($f_r \approx 800$ Hz) that can be used without causing slope overload in the same system. 
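
As a numerical illustration of Eqs. (6.51) and (6.52), the short Python sketch below evaluates 
the overload amplitude for a tone and for voice. The step size and sampling rate used in the 
example are hypothetical values, and the function names are illustrative. 

```python
import math


def max_tone_amplitude(step, fs, freq_hz):
    """A_max = E * f_s / omega, Eq. (6.51), with omega = 2*pi*freq_hz."""
    return step * fs / (2 * math.pi * freq_hz)


def max_voice_amplitude(step, fs, ref_hz=800.0):
    """Eq. (6.52): Eq. (6.51) evaluated at the reference frequency f_r = 800 Hz."""
    return max_tone_amplitude(step, fs, ref_hz)


E = 0.01        # step size in volts (hypothetical)
fs = 64_000     # sampling rate in samples per second (hypothetical)

conservative = max_tone_amplitude(E, fs, 4000.0)  # using the 4 kHz band edge
voice = max_voice_amplitude(E, fs)                # using the 800 Hz reference
print(conservative, voice, voice / conservative)
```

Whatever values of $E$ and $f_s$ are chosen, the 800 Hz reference allows an amplitude five 
times larger than the 4 kHz calculation, because $A_{\max}$ is inversely proportional to 
frequency. 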

Fortunately, the voice spectrum (as well as the television video signal) also decays with 
frequency and closely follows the overload characteristics (curve c, Fig. 6.32). For this reason, 
DM is well suited for voice (and television) signals. Actually, the voice signal spectrum (curve 
b) decreases as $1/\omega$ up to 2000 Hz, and beyond this frequency, it decreases as $1/\omega^2$. If we 
had used a double integration in the feedback circuit instead of a single integration, $A_{\max}$ in 
Eq. (6.51) would be proportional to $1/\omega^2$. Hence, a better match between the voice spectrum 
and the overload characteristics is achieved by using a single integration up to 2000 Hz and 
a double integration beyond 2000 Hz. Such a circuit (the double integration) responds fast 
but has a tendency to instability, which can be reduced by using some low-order prediction 
along with double integration. A double integrator can be built by placing in cascade two 
low-pass RC integrators with time constants $R_1C_1 = 1/200\pi$ and $R_2C_2 = 1/4000\pi$, respectively. 
This results in single integration from 100 to 2000 Hz and double integration beyond 
2000 Hz. 

Figure 6.32 Voice signal spectrum. 


Sigma-Delta Modulation 

While discussing the threshold of coding and overloading, we illustrated that the essence of 
the conventional DM is to encode and transmit the derivative of the analog message signal. 
Hence, the receiver of DM requires an integrator as shown in Fig. 6.31 and also, equivalently, 
in Fig. 6.33a. Since signal transmission inevitably is subjected to channel noise, such noise 
will be integrated and will accumulate at the receiver output, which is a highly undesirable 
phenomenon that is a major drawback of DM. 

To overcome this critical drawback of DM, a small modification can be made. First, we 
can view the overall DM system consisting of the transmitter and the receiver as approximately 
distortionless and linear. Thus, one of its serial components, the receiver integrator $1/s$, may 
be moved to the front of the transmitter (encoder) without affecting the overall modulator and 
demodulator response, as shown in Fig. 6.33b. Finally, the two integrators can be merged into 
a single one after the subtractor, as shown in Fig. 6.33c. This modified system is known as the 
sigma-delta modulation (Σ-ΔM). 

As we found in the study of preemphasis and deemphasis filters in FM, because channel 
noise and the message signal do not follow the same route, the order of serial components 
in the overall modulation-demodulation system can have different effects on the SNR. The 
seemingly minor move of the integrator $1/s$ in fact has several major advantages: 

* The channel noise no longer accumulates at the demodulator. 

* The important low-frequency content of the message $m(t)$ is preemphasized by the integrator 
$1/j\omega$. This helps many practical signals (such as speech) whose low-frequency components 
are more important. 

* The integrator effectively smooths the signal for encoding (Fig. 6.33b). Hence, overloading 
becomes less likely. 

* The low-pass nature of the integrator increases the correlation between successive samples, 
leading to smaller encoding error. 

* The demodulator is simplified. 
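
The structure with a single integrator after the subtractor can be sketched in discrete time. 
The Python code below is a first-order illustration under assumptions: an ideal accumulator 
stands in for the integrator, the input is limited to the range from -1 to 1, and a 
moving average stands in for the demodulator's low-pass filter. The function names are 
hypothetical. 

```python
def sigma_delta_encode(samples):
    """First-order discrete-time sigma-delta modulator for inputs in [-1, 1]."""
    integrator, feedback, bits = 0.0, 0.0, []
    for x in samples:
        integrator += x - feedback            # integrator placed after the subtractor
        bit = 1 if integrator >= 0 else -1    # 1-bit quantizer (comparator)
        feedback = float(bit)                 # output fed back to the subtractor
        bits.append(bit)
    return bits


def moving_average(bits, width):
    """Crude low-pass filter used here in place of the demodulator filter."""
    return [sum(bits[k - width:k]) / width for k in range(width, len(bits) + 1)]


# A constant input of 0.25 should produce a bit stream whose average is near 0.25.
bits = sigma_delta_encode([0.25] * 4000)
average = sum(bits) / len(bits)
smoothed = moving_average(bits, 400)
print(average)
```

Note that the receiver needs only the low-pass filter: no integrator follows the channel, 
so channel errors are averaged rather than accumulated. 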
Figure 6.33 (a) Conventional delta modulator. (b) Σ-Δ modulator. (c) Simpler Σ-Δ modulator. 


Adaptive Delta Modulation (ADM) 

The DM discussed so far suffers from one serious disadvantage. The dynamic range of amplitudes 
is too small because of the threshold and overload effects discussed earlier. To address 
this problem, some type of signal compression is necessary. In DM, a suitable method appears 
to be the adaptation of the step value $E$ according to the level of the input signal derivative. 
For example, in Fig. 6.31, when the signal $m(t)$ is falling rapidly, slope overload occurs. If we 
can increase the step size during this period, the overload could be avoided. On the other hand, 
if the slope of $m(t)$ is small, a reduction of step size will reduce the threshold level as well as 
the granular noise. The slope overload causes $d_q[k]$ to have several pulses of the same polarity 
in succession. This calls for increased step size. Similarly, pulses in $d_q[k]$ alternating continuously 
in polarity indicate small-amplitude variations, requiring a reduction in step size. In 
ADM we detect such pulse patterns and automatically adjust the step size.¹⁴ This results in a 
much larger dynamic range for DM. 
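
One simple way to act on these pulse patterns is sketched below in Python. The rule used 
here (multiply the step by 1.5 when two successive pulses have the same polarity, halve it 
when they alternate) and the step limits are assumptions chosen for illustration, not the 
scheme of any particular reference. Setting both factors to 1 gives conventional DM for 
comparison. 

```python
def next_step(step, bit, prev_bit, grow, shrink, lo, hi):
    """Adapt the step from the last two pulse polarities (hypothetical rule)."""
    if prev_bit == 0:                 # no history yet
        return step
    if bit == prev_bit:               # repeated polarity suggests slope overload
        return min(hi, step * grow)
    return max(lo, step * shrink)     # alternating polarity suggests granular region


def adm_encode(samples, step0=0.01, grow=1.5, shrink=0.5, lo=1e-3, hi=1.0):
    staircase, step, prev_bit, bits = 0.0, step0, 0, []
    for m in samples:
        bit = 1 if m > staircase else -1
        step = next_step(step, bit, prev_bit, grow, shrink, lo, hi)
        staircase += bit * step
        prev_bit = bit
        bits.append(bit)
    return bits


def adm_decode(bits, step0=0.01, grow=1.5, shrink=0.5, lo=1e-3, hi=1.0):
    staircase, step, prev_bit, out = 0.0, step0, 0, []
    for bit in bits:
        step = next_step(step, bit, prev_bit, grow, shrink, lo, hi)
        staircase += bit * step
        prev_bit = bit
        out.append(staircase)
    return out


# A steep ramp (0.05 per sample) overloads a fixed step of 0.01.
ramp = [0.05 * k for k in range(100)]
adaptive = adm_decode(adm_encode(ramp))
fixed = adm_decode(adm_encode(ramp, grow=1.0, shrink=1.0), grow=1.0, shrink=1.0)
print(abs(ramp[-1] - adaptive[-1]), abs(ramp[-1] - fixed[-1]))
```

With the fixed step the staircase falls steadily behind the ramp, whereas the adaptive 
version enlarges its step within a few samples and then stays close to the message. 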


6.8 VOCODERS AND VIDEO COMPRESSION 

PCM, DPCM, ADPCM, DM, and Σ-ΔM are all examples of what are known as waveform 
source encoders. Basically, waveform encoders do not take into consideration how the signals 
for digitization are generated. Hence, the amount of compression achievable by waveform 
encoders is highly limited by the degree of correlation between successive signal samples. 
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Figure 6.34 
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For a low-pass source signal with finite bandwidth B Hz, even if we apply the minimum 
Nyquist sampling rate 2B Hz and 1-bit encoding, the bit rate cannot be lower than 2B bit/s. 
There have been many successful methods introduced to drastically reduce the source coding 
rates of speech and video signals, very important to our daily communication needs. Unlike 
waveform encoders, the most successful speech and video encoders are based on the human 
physiological models involved in speech generation and in video perception. Here we describe 
the basic principles of the linear prediction voice coders (known as vocoders) and the video 
compression method proposed by the Moving Picture Experts Group (MPEG). 

6.8.1 Linear Prediction Coding Vocoders 

Voice Models and Model-Based Vocoders 

Linear prediction coding (LPC) vocoders are model-based systems. The model, in turn, is
based on a good understanding of the human voice mechanism. Fig. 6.34a provides a
cross-sectional illustration of the human speech apparatus. Briefly, human speech is produced by
the joint interaction of lungs, vocal cords, and the articulation tract, consisting of the mouth
and the nose cavity. Based on this physiological speech model, human voices can be divided
into voiced and unvoiced sound categories. Voiced sounds are those made while the vocal
cords are vibrating. Put a finger on your Adam's apple* while speaking, and you can feel the
vibration of the vocal cords when you pronounce all the vowels and some consonants, such as g
as in gut, b as in but, and n as in nut. Unvoiced sounds are made while the vocal cords are not
vibrating. Several consonants such as k, p, and t are unvoiced. Examples of unvoiced sounds
include h in hut, c in cut, and p in put.

For the production of voiced sounds, the lungs expel air through the epiglottis, causing
the vocal cords to vibrate. The vibrating vocal cords interrupt the airstream and produce a
quasi-periodic pressure wave consisting of impulses. The pressure wave impulses are commonly
called pitch impulses, and the frequency of the pressure signal is the pitch frequency or
fundamental frequency as shown in Fig. 6.34b. This is the part of the voice signal that defines
the speech tone. Speech that is uttered in a constant pitch frequency sounds monotonous. In
ordinary cases, the pitch frequency of a speaker varies almost constantly, often from syllable
to syllable.


* The slight projection at the front of the throat formed by the largest cartilage of the larynx, usually more prominent
in men than in women.
For voiced sound, the pitch impulses stimulate the air in the vocal tract (mouth and nasal
cavities). For unvoiced sounds, the excitation comes directly from the air flow. Extensive
studies [15-17] have shown that for unvoiced sounds, the excitation to the vocal tract is more like a
broadband noise. When cavities in the vocal tract resonate under excitation, they radiate a sound
wave, which is the speech signal. Both cavities form resonators with characteristic resonance
frequencies (formant frequencies). Changing the shape (hence the resonant characteristics) of
the mouth cavity allows different sounds to be pronounced. Amazingly, this (vocal) articulation
tract can be approximately modeled by a simple linear digital filter with an all-pole transfer
function

$$H(z) = \frac{g}{A(z)}$$

where g is a gain factor and A(z) is known as the prediction filter, much like the feedback filter
used in DPCM and ADPCM. One can view the function of the vocal articulation apparatus as
a spectral shaping filter H(z).
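As a rough illustration of this two-state source-filter model, the sketch below drives one all-pole filter with a pitch impulse train (voiced) and with white noise (unvoiced). The denominator coefficients, which place a single resonance near 500 Hz, the 100 Hz pitch, and the segment length are assumptions made for the example; a real vocal tract model would use a higher order with several formants.

```matlab
% Two-state excitation of an all-pole filter H(z) = g/A(z) (all values assumed)
fs    = 8000;                           % sampling rate in Hz
r     = 0.95;  f1 = 500;                % pole radius and resonance frequency (assumed)
Apoly = [1, -2*r*cos(2*pi*f1/fs), r^2]; % denominator of H(z) in powers of z^-1
g     = 1;                              % gain factor
N     = 800;                            % 100 ms segment

P  = fs/100;                            % pitch period in samples for a 100 Hz pitch
ev = zeros(1,N);  ev(1:P:N) = 1;        % voiced excitation: pitch impulse train
eu = randn(1,N);                        % unvoiced excitation: broadband noise

sv = filter(g, Apoly, ev);              % synthesized voiced segment
su = filter(g, Apoly, eu);              % synthesized unvoiced segment

% All poles must lie inside the unit circle for a stable synthesizer
fprintf('largest pole magnitude: %.3f\n', max(abs(roots(Apoly))));
```

Plotting sv shows a damped resonance restarted at every pitch impulse, whereas su looks like noise colored by the same resonance.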


LPC Models 

Based on this human speech model, a voice encoding approach different from waveform coding 
can be established. Instead of sending actual signal samples, the model-based vocoders analyze 
the voice signals segment by segment to determine the best-fitting speech model parameters. 
As shown in Fig. 6.35, after speech analysis, the transmitter sends the necessary speech model 
parameters (formants) for each voice segment to the receiver. The receiver then uses the 
parameters for the speech model to set up a voice synthesizer to regenerate the respective 
voice segments. In other words, what a user hears at the receiver actually consists of signals
reproduced by an artificial voice synthesizing machine.

In the analysis of a sampled voice segment (consisting of multiple samples), the pitch 
analysis will first determine whether the speech is a voiced or an unvoiced piece. If the signal 
is classified as “voiced,” the pitch analyzer will estimate pitch frequency (or equivalently the 
pitch period). In addition, the LPC analyzer will estimate the all-pole filter coefficients in A(z). 
Because the linear prediction error indicates how well the linear prediction filter fits the voice
samples, the LPC analyzer can determine the optimum filter coefficients by minimizing the
mean square error (MSE) of the linear prediction error [39].

Directly transmitting the linear prediction (LP) filter parameters is unsound because the
filter is very sensitive to parameter errors due to quantization and channel noises. Worse yet,
the LP filter may even become unstable because of small coefficient errors. In practice, the
stability of this all-pole linear prediction (LP) filter can be ensured by utilizing the modular
lattice filter structure through the well-known Levinson-Durbin algorithm [20, 21]. Lattice filter
parameters, known as reflection coefficients {r_k}, are less sensitive to quantization errors and
noise. Transmission is further improved by sending their log-area ratios (LAR), defined as

$$o_k \triangleq \log\frac{1 + r_k}{1 - r_k}$$

or by sending intermediate values from the Levinson-Durbin recursion known as the partial
reflection coefficients (PARCOR). Another practical approach is to find the equivalent line
spectral pairs (LSP) as representation of the LPC filter coefficients for transmission over
channels. LSP has the advantage of low sensitivity to quantization noise [22, 23]. As long as
the pth-order all-pole LP filter is stable, it can be represented by p real-valued line spectral
frequencies. In every representation, however, a pth-order synthesizer filter can be obtained
by the LPC decoder from the quantization of p real-valued coefficients. In general, 8 to 14 LP
parameters are sufficient for vocal tract representation.

Figure 6.35

TABLE 6.1
Quantization Bit Allocation in LPC-10 Vocoder

                                                   10 LP Filter Parameters, bits/coefficient
            Pitch Period   Voiced/Unvoiced   Gain g    r_1-r_4   r_5-r_8    r_9        r_10
Voiced      6 bits         1 bit             5 bits    5 bits    4 bits     3 bits     2 bits
Unvoiced    6 bits         1 bit             5 bits    5 bits    Not used   Not used   Not used
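The analysis chain described above (autocorrelation, Levinson-Durbin recursion, reflection coefficients, log-area ratios) can be sketched as follows. This is a minimal illustration under stated assumptions: the test frame is the impulse response of the one-pole filter 1/(1 - 0.9 z^-1) rather than real speech, the order p = 2 is chosen only to keep the example short, the natural logarithm is assumed for the LAR, and sign conventions for reflection coefficients differ between references.

```matlab
% Levinson-Durbin sketch on a deterministic test frame (assumed data)
p = 2;                              % predictor order for this toy example
x = 0.9.^(0:199);                   % frame: impulse response of 1/(1 - 0.9 z^-1)
N = length(x);

R = zeros(1,p+1);                   % autocorrelation: R(l+1) holds lag l
for l = 0:p
    R(l+1) = sum(x(1:N-l).*x(1+l:N));
end

a  = zeros(1,p);                    % predictor: xhat[n] = sum_k a(k) x[n-k]
rc = zeros(1,p);                    % reflection coefficients r_k
E  = R(1);                          % prediction error energy
for i = 1:p
    acc = R(i+1);
    for j = 1:i-1
        acc = acc - a(j)*R(i-j+1);
    end
    ki    = acc/E;                  % i-th reflection coefficient
    rc(i) = ki;
    anew  = a;
    anew(i) = ki;
    for j = 1:i-1
        anew(j) = a(j) - ki*a(i-j);
    end
    a = anew;
    E = (1 - ki^2)*E;               % error energy never increases while |r_k| < 1
end

LAR = log((1 + rc)./(1 - rc));      % log-area ratios (natural log assumed)
fprintf('a   = %s\n', mat2str(a,4));
fprintf('r_k = %s\n', mat2str(rc,4));
fprintf('LAR = %s\n', mat2str(LAR,4));
```

For this frame the recursion recovers a(1) close to 0.9 and a(2) close to 0, as expected for a one-pole source, and every |r_k| stays below 1, which is the condition for a stable lattice synthesizer.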

We can now use a special LPC example to illustrate the code efficiency of such model-based
vocoders. In the so-called LPC-10 vocoder,* the speech is sampled at 8 kHz, and 180 samples
(22.5 ms) form an LPC frame for transmission [24]. The bits per speech frame are allocated to
quantize the pitch period, the voiced/unvoiced flag, the filter gain, and the 10 filter coefficients,
according to Table 6.1. Thus, each frame requires between 32 (unvoiced) and 53 (voiced) bits.
Adding frame control bits results in an average coded stream of 54 bits per speech frame, or an
overall rate of 2400 bit/s [24]. Based on subjective tests, this rather minimal LPC-10 codec has
a low mean opinion score (MOS) but does provide highly intelligible speech connections. LPC-10
is part of FS-1015, a low-rate secure telephony codec standard developed by the U.S.
Department of Defense in 1984. A later enhancement to LPC-10 is known as LPC-10(e).
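The frame budget quoted above follows directly from the bit allocation of Table 6.1. The short script below simply redoes that arithmetic; the variable names are ours, and the 64 kbit/s PCM figure used for the comparison is the one quoted in this section.

```matlab
% LPC-10 frame budget from Table 6.1
bits_common   = 6 + 1 + 5;                      % pitch period, voiced/unvoiced flag, gain
bits_voiced   = bits_common + 4*5 + 4*4 + 3 + 2; % r_1-r_4, r_5-r_8, r_9, r_10 -> 53 bits
bits_unvoiced = bits_common + 4*5;              % only r_1-r_4 are sent      -> 32 bits

fs             = 8000;                          % sampling rate in Hz
frame_samples  = 180;                           % samples per LPC frame
frame_ms       = 1000*frame_samples/fs;         % 22.5 ms
bits_per_frame = 54;                            % including frame control bits
rate_bps       = bits_per_frame*fs/frame_samples; % 2400 bit/s

fprintf('voiced: %d bits, unvoiced: %d bits\n', bits_voiced, bits_unvoiced);
fprintf('frame: %.1f ms, rate: %d bit/s, PCM/LPC-10 ratio: %.1f\n', ...
        frame_ms, rate_bps, 64000/rate_bps);
```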

Compared with the 64 kbit/s PCM or the 32 kbit/s ADPCM waveform codec, LPC vocoders
are much more efficient and can achieve speech code rates below 9.6 kbit/s. The 2.4 kbit/s
LPC-10 example can provide speech digitization at a rate much lower than even the speech
waveform sampling rate of 8 kHz. The loss of speech quality is a natural trade-off. To better
understand the difference between waveform vocoders and the model-based vocoders such as
LPC, we can use the analogy of a food delivery service. Imagine a family living in Alaska that
wishes to order a nice meal from a famous restaurant in New York City. For practical reasons,
the restaurant would have to send prepared dishes uncooked and frozen; then the family would
follow the cooking directions. The food would probably taste fine, but the meal would be
missing the finesse of the original chef. This option is like speech transmission via PCM. The
receiver has the basic ingredients but must tolerate the quantization error (manifested by the
lack of the chef's cooking finesse). To reduce transportation weight, another option is for the
family to order the critical ingredients only. The heavier but common ingredients (such as rice
and potatoes) can be acquired locally. This approach is like DPCM or ADPCM, in which only
the unpredictable part of the voice is transmitted. Finally, the family can simply go online to
order the chef's recipe. All the ingredients are purchased locally and the cooking is also done
locally. The Alaskan family can satisfy their gourmet craving without receiving a single food
item from New York! Clearly, the last scenario captures the idea of model-based vocoders.
LPC vocoders essentially deliver the recipe (i.e., the LPC parameters) for voice synthesis at
the receiver end.

* So called because it uses order p = 10. The idea is to allocate two parameters for each possible formant frequency
peak.


Practical High-Quality LP Vocoders 

The simple dual-state LPC synthesis of Fig. 6.35 describes no more than the basic idea behind 
model-based voice codecs. The quality of LP vocoders has been greatly improved by a number 
of more elaborate codecs in practice. By adding a few bits, these LP-based vocoders attempt 
to improve the speech quality in two ways: by encoding the residual prediction error and by 
enhancing the excitation signal. 

The most successful methods belong to the class known as code-excited linear prediction
(CELP) vocoders. CELP vocoders use a codebook, a table of typical LP error (or residue)
signals, which is set up a priori by designers. At the transmitter, the analyzer compares the
actual prediction residue to all the entries in the codebook, chooses the entry that is the closest
match, and just adds the address (code) for that entry to the bits for transmission. The synthesizer
receives this code, retrieves the corresponding residue from the codebook, and uses it to modify
the synthesizing output. For CELP to work well, the codebook must be big enough, requiring
more transmission bits. The FS-1016 vocoder is an improvement over FS-1015 and provides
good quality, natural-sounding speech at 4.8 kbit/s [25]. More modern variants include the
RPE-LTP (regular pulse excitation, long-term prediction) LPC codec used in GSM cellular systems,
the algebraic CELP (ACELP), the relaxed CELP (RCELP), the Qualcomm CELP (QCELP)
in CDMA cellular phones, and vector-sum excited linear prediction (VSELP). Their data rates
range from as low as 1.2 kbit/s to 13 kbit/s (full-rate GSM). These vocoders form the basis of
many modern cellular vocoders, voice over Internet Protocol (VoIP), and other ITU-T G-series
standards.
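The codebook lookup described above amounts to a nearest-neighbor search. The toy sketch below uses a hypothetical four-entry codebook and a made-up residue vector, and it measures closeness by plain squared error. Practical CELP coders use far larger codebooks and a perceptually weighted analysis-by-synthesis search, neither of which is modeled here.

```matlab
% Toy CELP-style codebook search (codebook and residue are made-up values)
codebook = 0.5*[ 1  0  0  0;
                 0  1  0  0;
                 1 -1  1 -1;
                 1  1  1  1];           % 4 entries -> 2-bit address
residue  = [0.6 -0.4 0.5 -0.45];        % actual prediction residue for one segment

M   = size(codebook,1);
err = sum((codebook - repmat(residue, M, 1)).^2, 2);  % squared error per entry
[minErr, idx] = min(err);               % closest entry and its address

address_bits = log2(M);                 % bits needed to send the address
decoded      = codebook(idx,:);         % what the synthesizer retrieves
fprintf('send address %d (%d bits), squared error %.4f\n', idx, address_bits, minErr);
```

Here the third entry is selected, so only its 2-bit address is transmitted instead of the four residue samples. Doubling the codebook size costs one more address bit per segment, which is the trade-off noted in the text.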


Video Compression 

For video and television to go digital, we face a tremendous challenge. Because of the high
video bandwidth (approximately 4.2 MHz), use of direct sampling and quantization leads to
an uncompressed digital video signal of roughly 150 Mbit/s. Thus, the modest compression
afforded by techniques such as ADPCM and subband coding [26, 27] is insufficient. The key to
video compression, as it turns out, has to do with human visual perception.

A great deal of research and development has resulted in methods to drastically reduce the
digital bandwidth required for video transmission. Early compression techniques compressed
video signals to approximately 45 Mbit/s (DS3). For the emerging video delivery technologies
of HFC, ADSL, HDTV, and so on, however, much greater compression was required.
MPEG approached this problem and developed new compression techniques, which provide
network or VCR quality video at much greater levels of compression. MPEG is a joint effort
of the International Standards Organization (ISO), the International Electrotechnical
Commission (IEC), and the American National Standards Institute (ANSI) X3L3 Committee [28, 29].
MPEG has a very informative website that provides extensive information on MPEG and JPEG
technologies and standards (http://www.mpeg.org/index.html/). MPEG also has an industrial
forum promoting the organization's products (http://www.m4if.org/).

The concept of digital video compression is based on the fact that, on the average, a
relatively small number of pixels change from frame to frame. Hence, if only the changes
are transmitted, the transmission bandwidth can be reduced significantly. Digitizing allows
the noise-free recovery of analog signals and improves the picture quality at the receiver.
Compression reduces the bandwidth required for transmission and the amount of storage for a
video program and, hence, expands channel capacity. Without compression, a 2-hour digitized
NTSC video program would require roughly 100 gigabytes of storage, far exceeding the
capacity of any DVD disc.

There are three primary MPEG standards in use: 


MPEG-1: Used for VCR-quality video and storage on video CD (or VCD) at a data rate of 1.5 
Mbit/s. These VCDs were quite popular throughout Asia (except Japan). MPEG-1 
decoders are available on most computers. VCD is also a very popular format for 
karaoke. 

MPEG-2: Supports diverse video coding applications for transmissions ranging in quality 
from VCR to high-definition TV (HDTV), depending on data rate. It offers 50:1 
compression of raw video. MPEG-2 is a highly popular format used in DVD, HDTV, 
terrestrial digital video broadcasting (DVB-T), and digital video broadcasting by 
satellite (DVB-S). 

MPEG-4: Provides multimedia (audio, visual, or audiovisual) content streaming over different
bandwidths, including the Internet. MPEG-4 is supported by Microsoft Windows
Media Player, Real Networks, and Apple's QuickTime and iPod. MPEG-4 recently
converged with an ITU-T standard known as H.264, to be discussed later.


The power of video compression is staggering. By comparison, NTSC broadcast television in
digital form requires 45 to 120 Mbit/s, whereas MPEG-2 requires 1.5 to 15 Mbit/s. HDTV,
on the other hand, would require 800 Mbit/s uncompressed; under MPEG-2 compression, it is
transmitted at 19.39 Mbit/s.
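A quick calculation puts these figures in perspective. The script below uses only numbers quoted in this section; the one added assumption is that a gigabyte means 10^9 bytes.

```matlab
% Back-of-the-envelope check of the quoted video rates (1 GB = 1e9 bytes assumed)
hdtv_raw_mbps   = 800;                          % uncompressed HDTV
hdtv_mpeg2_mbps = 19.39;                        % HDTV after MPEG-2 compression
hdtv_ratio      = hdtv_raw_mbps/hdtv_mpeg2_mbps; % about 41:1

storage_bytes = 100e9;                          % quoted storage for a 2-hour NTSC program
duration_s    = 2*3600;
implied_mbps  = 8*storage_bytes/duration_s/1e6; % average bit rate implied by that storage

fprintf('HDTV compression ratio: about %.0f:1\n', hdtv_ratio);
fprintf('2-hour, 100 GB program: about %.0f Mbit/s\n', implied_mbps);
```

The implied rate of about 111 Mbit/s falls inside the 45 to 120 Mbit/s range quoted for digitized NTSC, so the storage figure and the rate figures are mutually consistent.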

There are two types of MPEG compression, which eliminate redundancies in the 
audiovisual signals that are not perceptible by the listener or the viewer: 

1. Video

■ Temporal or interframe compression by predicting interframe motion and removing
interframe redundancy

■ Spatial or intraframe compression, which forms a block identifier for a group of pixels
having the same characteristics (color, intensity, etc.) for each frame. Only the block
identifier is transmitted.

2. Audio, which uses a psychoacoustic model of masking effects.

The basis for video compression is to remove redundancy in the video signal stream. As an
example of interframe redundancy, consider Fig. 6.36a and b. In Fig. 6.36a the runner is in
position A and in Fig. 6.36b he is in position B. Note that the background (cathedral, buildings,
and bridge) remains essentially unchanged from frame to frame. Figure 6.36c represents the
nonredundant information for transmission; that is, the change between the two frames. The
runner image on the left represents the blocks of frame 1 that are replaced by background
in frame 2. The runner image on the right represents the blocks of frame 1 that replace the
background in frame 2.

Video compression starts with an encoder, which converts the analog video signal from
the video camera to a digital format on a pixel-by-pixel basis. Each video frame is divided
into 8 × 8 pixel blocks, which are analyzed by the encoder to determine which blocks must be
transmitted, that is, which blocks have significant changes from frame to frame. This process
takes place in two stages:
Figure 6.36 (a) Frame 1. (b) Frame 2. (c) Information transferred between frames 1 and 2.


1. Motion estimation and compensation. Here a motion estimator identifies the areas or groups
of blocks from a preceding frame that match corresponding areas in the current frame and
sends the magnitude and direction of the displacement to a predictor in the decoder. The
frame difference information is called the residual.

2. Transforming the residual on a block-by-block basis into more compact form.
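The first stage can be illustrated by an exhaustive block-matching search. In the sketch below the "previous frame" is a 16 × 16 magic square (chosen only because all of its entries are distinct) and the "current frame" is the same image shifted down 2 pixels and right 3 pixels. The block size, search range, and sum-of-absolute-differences (SAD) criterion are assumptions made for the example; the text does not specify how the motion estimator searches.

```matlab
% Full-search block matching with a SAD criterion (synthetic frames, assumed sizes)
prev = magic(16);                   % previous frame: all 256 entries are distinct
cur  = circshift(prev, [2 3]);      % current frame: content moved down 2, right 3

B  = 8;                             % block size (8 x 8)
S  = 4;                             % search range in pixels (assumed)
r0 = 5;  c0 = 5;                    % top-left corner of the block in the current frame
blk = cur(r0:r0+B-1, c0:c0+B-1);

best = inf;  mv = [0 0];
for dy = -S:S
    for dx = -S:S
        cand = prev(r0+dy:r0+dy+B-1, c0+dx:c0+dx+B-1);
        sad  = sum(abs(blk(:) - cand(:)));
        if sad < best
            best = sad;  mv = [dy dx];  % displacement of the best match in prev
        end
    end
end

% Residual: current block minus its motion-compensated prediction
pred     = prev(r0+mv(1):r0+mv(1)+B-1, c0+mv(2):c0+mv(2)+B-1);
residual = blk - pred;
fprintf('motion vector (dy,dx) = (%d,%d), SAD = %d\n', mv(1), mv(2), best);
```

The search finds the matching block displaced by (-2, -3) in the previous frame with zero SAD, so the residual for this block is all zeros and only the motion vector would need to be sent.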


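The encoded residual signal is transformed into a more compact form by means of a discrete
cosine transform (DCT) (see Sec. 6.5.2 in Haskel et al. [28]), which uses a numerical value to
represent each pixel and normalizes that value for more efficient transmission. The DCT is of
the form

$$F(j,k) = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} f(n,m)\cos\left[\frac{(2n+1)j\pi}{2N}\right]\cos\left[\frac{(2m+1)k\pi}{2N}\right]$$

where f(n, m) is the value assigned to the block in the (n, m) position and N is the block
dimension. The DCT is typically multiplied, for an 8 × 8 block, by the expression C(j)C(k)/4, where

$$C(x) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } x = 0 \\ 1 & \text{otherwise} \end{cases}$$

With F(j, k) denoting the coefficients after this multiplication, the inverse transform for an
8 × 8 block is

$$f(n,m) = \sum_{j=0}^{7}\sum_{k=0}^{7} \frac{C(j)C(k)}{4}\,F(j,k)\cos\left[\frac{(2n+1)j\pi}{16}\right]\cos\left[\frac{(2m+1)k\pi}{16}\right]$$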

Tables 6.2 and 6.3 depict the pixel block values before and after the DCT. One can
notice from Table 6.3 that there are relatively few meaningful elements, that is, elements
with significant values relative to the values centered about the 0, 0 position. Because of this,
most of the matrix values may be assumed to be zero, and, upon inverse transformation, the
original values are quite accurately reproduced. This process greatly reduces the amount of data that
must be transmitted, perhaps by a factor of 8 to 10 on the average. Note that the size
of the transmitted residual may be that of an individual block or, at the other extreme, that of
the entire picture.

The transformed matrix values of a block (Table 6.3) are normalized so that most of the
values in the block matrix are less than 1. Then the resulting normalized matrix is quantized to
obtain Table 6.4. Normalization is accomplished by a dynamic matrix of multiplicative values,
which are applied element by element to the transformed matrix. The normalized matrix of
Table 6.4 is the block information transmitted to the decoder. The denormalized matrix pictured
in Table 6.5 and the reconstructed (inverse-transformed) residual in Table 6.6 are determined
by the decoder. The transformation proceeds in a zigzag pattern, as illustrated in Fig. 6.37.

TABLE 6.2
8 × 8 Pixel Block Residual (columns indexed by n)

158  158  158  163  161  161  162  162
157  157  157  162  163  161  162  162
157  157  157  160  161  161  161  161
155  155  155  162  162  161  160  159
159  159  159  160  160  162  161  159
156  156  156  158  163  160  155  150
156  156  156  159  156  153  151  144
155  155  155  155  153  149  144  139

TABLE 6.3
Transformed 8 × 8 Pixel Block Residual DCT Coefficients (columns indexed by j, rows by k)

1259.6    1.0  -12.1    5.2    2.1    1.7   -2.7   -1.3
  22.6  -17.5    6.2   -3.2    2.9   -0.1   -0.4   -1.2
 -10.9    9.3   -1.6   -1.5    0.2    0.9   -0.6    0.1
   7.1   -1.9   -0.2    1.5   -0.9   -0.1    0.0    0.3
  -0.6    0.8    1.5   -1.6   -0.1    0.7    0.6   -1.3
  -1.8   -0.2   -1.6   -0.3    0.8    1.5   -1.0   -1.0
  -1.3    0.4   -0.3    1.5   -0.5   -1.7    1.1    0.8
   2.6    1.6    3.8   -1.8   -1.9    1.2    0.6   -0.4

TABLE 6.4
Normalized and Quantized Residual DCT Coefficients (columns indexed by j, rows by k)

21    0   -1    0    0    0    0    0
 2   -1    0    0    0    0    0    0
-1    1    0    0    0    0    0    0
 0    0    0    0    0    0    0    0
 0    0    0    0    0    0    0    0
 0    0    0    0    0    0    0    0
 0    0    0    0    0    0    0    0
 0    0    0    0    0    0    0    0
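The transform step can be reproduced numerically from the block of Table 6.2. The sketch below builds the 8 × 8 DCT as a matrix product using the C(j)C(k)/4 normalization. Because the normalization matrix used for Table 6.4 is not given in the text, the sketch substitutes a much cruder rule, zeroing every coefficient of magnitude below 5; that threshold is purely an assumption for illustration and does not reproduce Table 6.4.

```matlab
% 8 x 8 DCT of the Table 6.2 block; the threshold of 5 is an assumed stand-in
% for the (unspecified) normalization and quantization step
f = [158 158 158 163 161 161 162 162;
     157 157 157 162 163 161 162 162;
     157 157 157 160 161 161 161 161;
     155 155 155 162 162 161 160 159;
     159 159 159 160 160 162 161 159;
     156 156 156 158 163 160 155 150;
     156 156 156 159 156 153 151 144;
     155 155 155 155 153 149 144 139];

N = 8;
T = zeros(N);                       % T(j+1,n+1) = (C(j)/2) cos((2n+1) j pi / 16)
for j = 0:N-1
    if j == 0, Cj = 1/sqrt(2); else, Cj = 1; end
    for n = 0:N-1
        T(j+1,n+1) = (Cj/2)*cos((2*n+1)*j*pi/(2*N));
    end
end

F  = T*f*T';                        % forward 2-D DCT, normalized by C(j)C(k)/4
Fq = F;  Fq(abs(F) < 5) = 0;        % keep only the significant coefficients
fr = T'*Fq*T;                       % inverse 2-D DCT of the thinned matrix

rmsErr = sqrt(mean((f(:) - fr(:)).^2));
fprintf('dc coefficient F(0,0) = %.1f\n', F(1,1));
fprintf('coefficients kept: %d of 64, rms reconstruction error: %.2f\n', ...
        nnz(Fq), rmsErr);
```

The dc coefficient equals one-eighth of the sum of the 64 pixel values, which agrees with the 1259.6 entry of Table 6.3. Since T is an orthonormal matrix, the reconstruction error energy equals the energy of the discarded coefficients, which is why dropping the many small entries costs so little.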

TABLE 6.5
Denormalized DCT Coefficients (columns indexed by j, rows by k)

1260    0  -12    0    0    0    0    0
  23  -18    0    0    0    0    0    0
 -11   10    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0

TABLE 6.6
Inverse DCT Coefficients Reconstructed Residual (columns indexed by n)

158  158  158  163  161  161  162  162
157  157  157  162  163  161  162  162
157  157  157  160  161  161  161  161
155  155  155  162  162  161  160  159
159  159  159  160  160  162  161  159
156  156  156  158  163  160  155  150
156  156  156  159  156  153  151  144
155  155  155  155  153  149  144  139

MPEG approaches the motion estimation and compensation to remove temporal (frame-to-frame)
redundancy in a unique way. MPEG uses three types of frame, the intraframe or I-frame
(sometimes called the independently coded or intracoded frame), the predicted (predictive) or
P-frame, and the bidirectionally predictive frame or B-frame. The P-frames are predicted from
the I-frames. The B-frames are bidirectionally predicted from either past or future frames. An
I-frame and one or more P-frames and B-frames make up the basic MPEG processing pattern,
called a group of pictures (GOP). Most of the frames in an MPEG compressed image are
B-frames. The I-frame provides the initial reference for the frame differences to start the
MPEG encoding process. Note that the bidirectional aspect of the procedure introduces a
delay in the transmission of the frames. This is because the GOP is transmitted as a unit and,
hence, transmission cannot start until the GOP is complete (Fig. 6.38). The details of the
procedure are beyond the scope of this text. There are many easily accessible books that cover
this subject in detail. In addition, one may find numerous references to MPEG compression
and HDTV on the internet.
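One way to see where the delay comes from is to compare the order in which frames are displayed with the order in which they can be coded. The sketch below assumes a short hypothetical GOP pattern, IBBPBBP, and the common arrangement in which each run of B-frames is coded after the later reference frame it depends on. The actual GOP structure is an encoder choice and is not specified in the text.

```matlab
% Display order versus coding order for an assumed GOP pattern
display_order = 'IBBPBBP';          % hypothetical GOP in display order

codedIdx   = [];                    % frame indices in coding order
pendingIdx = [];                    % B-frames waiting for their future reference
for k = 1:length(display_order)
    if display_order(k) == 'B'
        pendingIdx(end+1) = k;      % hold back until the next I- or P-frame arrives
    else
        codedIdx   = [codedIdx, k, pendingIdx];  % reference first, then held B-frames
        pendingIdx = [];
    end
end
codedIdx = [codedIdx, pendingIdx];  % trailing B-frames, if any

coded_order = display_order(codedIdx);
fprintf('display order: %s\n', display_order);
fprintf('coding order : %s (frame indices %s)\n', coded_order, mat2str(codedIdx));
```

In this example the second and third frames cannot be coded until the fourth frame is available, which is the source of the buffering delay mentioned above.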
















Figure 6.38 MPEG temporal frame structure (figure labels: bidirectional interpolation, forward prediction; horizontal axis: time).


Other Video Compression Standards 

We should mention that in addition to MPEG, there is a parallel attempt by ITU-T to standardize
video coding. These standards apply similar concepts for video compression. Today, the
well-known ITU-T video compression standards are the H.26x series, including H.261, H.263, and
H.264. H.261 was developed for transmission of video at a rate of multiples of 64 kbit/s in
applications such as videophone and videoconferencing. Similar to MPEG compression, H.261
uses motion-compensated temporal prediction.

H.263 was designed for very low bit rate coding applications, such as videoconferencing.
It uses block motion-compensated DCT structure for encoding [30]. Based on H.261, H.263 is
better optimized for coding at low bit rates and achieves much higher efficiency than H.261
encoding. Flash Video, a highly popular format for video sharing on many web engines such
as YouTube and MySpace, uses a close variant of the H.263 codec called the Sorenson Spark
codec.

In fact, H.264 represents a recent convergence between ITU-T and MPEG and is a joint
effort of the two groups. Also known as MPEG-4 Part 10, H.264 typically outperforms MPEG-2
by cutting the data rate nearly in half. This versatile standard supports video applications
over multiple levels of bandwidth and quality, including mobile phone service at 50 to 60
kbit/s, Internet/standard definition video at 1 to 2 Mbit/s, and high-definition video at 5 to 8
Mbit/s. H.264 is also supported in many other products and applications including iPod, direct
broadcasting satellite TV, some regional terrestrial digital TV, Mac OS X (Tiger), and Sony's
Playstation Portable.


A Note on High-Definition Television (HDTV) 

Utilizing MPEG-2 for video compression, high-definition television (HDTV) is one of the
advanced television (ATV) functions along with 525-line compressed video for direct broadcast
satellite (DBS) or cable. The concept of HDTV appeared in the late 1970s. Early development
work was performed primarily in Japan based on an analog system. In the mid-1980s it became
apparent that the bandwidth requirements of an analog system would be excessive, and work
began on a digital system that could utilize the 6 MHz bandwidth of NTSC television. In
the early 1990s seven digital systems were proposed, but testing indicated that none would
be highly satisfactory. Therefore, in 1993 the FCC suggested the formation of an industrial
“Grand Alliance” (GA) to develop a common HDTV standard. In December 1997, Standard
A/53 for broadcast transmission, proposed by the Advanced Television Systems Committee
(ATSC), was finalized by the FCC in the United States.

The GA HDTV standard is based on a 16:9 aspect ratio (motion picture aspect ratio)
rather than the 4:3 aspect ratio of NTSC television. HDTV uses MPEG-2 compression at
19.39 Mbit/s and a digital modulation format called 8-VSB (vestigial sideband), which uses
an eight-amplitude-level symbol to represent 3 bits of information. Transmission is in
207-byte blocks, which include 20 parity bytes for Reed-Solomon forward error correction. The
remaining 187-byte packet format is a subset of the MPEG-2 protocol and includes headers
for timing, switching, and other transmission control.

The Advanced Television Systems Group, the successor to the Grand Alliance, has been
developing standards and recommended practices for HDTV. These are found, along with a
great deal of other information, on their website: http://www.atsc.org/.


6.9 MATLAB EXERCISES 

In the MATLAB exercises of this section, we provide examples of signal sampling, signal 
reconstruction from samples, uniform quantization, pulse-coded modulation (PCM), and delta 
modulation (DM). 

Sampling and Reconstruction of Lowpass Signals 

In the sampling example, we first construct a signal g(t) with two sinusoidal components of
1-second duration; their frequencies are 1 and 3 Hz. If the signal duration were infinite,
the bandwidth of g(t) would be 3 Hz. However, the finite duration of the
signal implies that the actual signal is not band-limited, although most of the signal content
stays within a bandwidth of 5 Hz. For this reason, we select a sampling frequency of 50
Hz, much higher than the minimum Nyquist frequency of 6 Hz. The MATLAB program,
Exsample.m, implements sampling and signal reconstruction. Figure 6.39 illustrates the
original signal, its uniform samples at the 50 Hz sampling rate, and the frequency response of
the sampled signal. In accordance with our analysis of Section 6.1, the spectrum of the sampled
signal g_T(t) consists of the original signal spectrum periodically repeated every 50 Hz.


Figure 6.39 (panel titles: original signal, sampled signal)

% (Exsample.m)
% Example of sampling, quantization, and zero-order hold
clear;clf;
td=0.002;                       %original sampling rate 500 Hz
t=[0:td:1.];                    %time interval of 1 second
xsig=sin(2*pi*t)-sin(6*pi*t);   % 1Hz+3Hz sinusoids
Lsig=length(xsig);
ts=0.02;                        %new sampling rate = 50Hz.
Nfactor=round(ts/td);           % integer ratio of new to original sampling period
% send the signal through a 16-level uniform quantizer
[s_out,sq_out,sqh_out,Delta,SQNR]=sampandquant(xsig,16,td,ts);
% receive 3 signals:
%    1. sampled signal s_out
%    2. sampled and quantized signal sq_out
%    3. sampled, quantized, and zero-order hold signal sqh_out
%
% calculate the Fourier transforms
Lfft=2^ceil(log2(Lsig)+1);
Fmax=1/(2*td);
Faxis=linspace(-Fmax,Fmax,Lfft);
Xsig=fftshift(fft(xsig,Lfft));
S_out=fftshift(fft(s_out,Lfft));
% Examples of sampling and reconstruction using
%    a) ideal impulse train through LPF
%    b) flat top pulse reconstruction through LPF
% plot the original signal and the sample signals in time
% and frequency domain
figure(1);
subplot(311); sfig1a=plot(t,xsig,'k');
hold on; sfig1b=plot(t,s_out(1:Lsig),'b'); hold off;
set(sfig1a,'Linewidth',2); set(sfig1b,'Linewidth',2.);
xlabel('time (sec)');
title('Signal {\it g}({\it t}) and its uniform samples');
subplot(312); sfig1c=plot(Faxis,abs(Xsig));
xlabel('frequency (Hz)');
axis([-150 150 0 300])
set(sfig1c,'Linewidth',1); title('Spectrum of {\it g}({\it t})');
subplot(313); sfig1d=plot(Faxis,abs(S_out));
xlabel('frequency (Hz)');
axis([-150 150 0 300/Nfactor])
set(sfig1d,'Linewidth',1); title('Spectrum of {\it g}_T({\it t})');
% calculate the reconstructed signal from ideal sampling and
% ideal LPF
% Maximum LPF bandwidth equals to BW=floor((Lfft/Nfactor)/2);
BW=10;                          %Bandwidth is no larger than 10Hz.
% BW counts FFT bins on each side of dc (each bin is about 0.49 Hz here)
H_lpf=zeros(1,Lfft);H_lpf(Lfft/2-BW:Lfft/2+BW-1)=1; %ideal LPF
S_recv=Nfactor*S_out.*H_lpf;            % ideal filtering
s_recv=real(ifft(fftshift(S_recv)));    % reconstructed f-domain
s_recv=s_recv(1:Lsig);                  % reconstructed t-domain
% plot the ideally reconstructed signal in time
% and frequency domain
figure(2)
subplot(211); sfig2a=plot(Faxis,abs(S_recv));
xlabel('frequency (Hz)');
axis([-150 150 0 300]);
title('Spectrum of ideal filtering (reconstruction)');
subplot(212); sfig2b=plot(t,xsig,'k-.',t,s_recv(1:Lsig),'b');
legend('original signal','reconstructed signal');
xlabel('time (sec)');
title('original signal versus ideally reconstructed signal');
set(sfig2b,'Linewidth',2);
% non-ideal reconstruction
ZOH=ones(1,Nfactor);
s_ni=kron(downsample(s_out,Nfactor),ZOH);
S_ni=fftshift(fft(s_ni,Lfft));
S_recv2=S_ni.*H_lpf;                    % ideal filtering
s_recv2=real(ifft(fftshift(S_recv2)));  % reconstructed f-domain
s_recv2=s_recv2(1:Lsig);                % reconstructed t-domain
% plot the ideally reconstructed signal in time
% and frequency domain
figure(3)
subplot(211); sfig3a=plot(t,xsig,'b',t,s_ni(1:Lsig),'b');
xlabel('time (sec)');
title('original signal versus flat-top reconstruction');
subplot(212); sfig3b=plot(t,xsig,'b',t,s_recv2(1:Lsig),'b--');
legend('original signal','LPF reconstruction');
xlabel('time (sec)');
set(sfig3a,'Linewidth',2); set(sfig3b,'Linewidth',2);
title('original and flat-top reconstruction after LPF');


To construct the original signal g(t) from the impulse sampling train g_T(t), we applied an
ideal low-pass filter with bandwidth 10 Hz in the frequency domain. This corresponds to the
interpolation using the ideal sinc function as shown in Sec. 6.1.1. The resulting spectrum, as
shown in Fig. 6.40, is nearly identical to the original message spectrum of g(t). Moreover, the
time domain signal waveforms are also compared in Fig. 6.40 and show a near perfect match.

Figure 6.40

In our last exercise in sampling and reconstruction, given in the same program, we use
a simple rectangular pulse of width T_s (sampling period) to reconstruct the original signal
from the samples (Fig. 6.41). A low-pass filter is applied on the rectangular reconstruction
and also shown in Fig. 6.41. It is clear from comparison to the original source signal that the
recovered signal is still very close to the original signal g(t). This is because we have chosen
a high sampling rate such that T_p = T_s is so small that the approximation of Eq. (6.17) holds.
Certainly, based on our analysis, by applying the low-pass equalization filter of Eq. (6.16), the
reconstruction error can be greatly reduced.

Figure 6.41: Reconstructed signal spectrum and waveform; reconstruction followed by LPF
without equalization.

PCM Illustration

The uniform quantization of an analog signal using L quantization levels can be implemented
by the MATLAB function uniquan.m.


% (uniquan.m)
function [q_out,Delta,SQNR]=uniquan(sig_in,L)
% Usage
%   [q_out,Delta,SQNR]=uniquan(sig_in,L)
%   L      - number of uniform quantization levels
%   sig_in - input signal vector
% Function outputs:
%   q_out  - quantized output
%   Delta  - quantization interval
%   SQNR   - actual signal to quantization noise ratio
sig_pmax=max(sig_in); % finding the positive peak
sig_nmax=min(sig_in); % finding the negative peak
Delta=(sig_pmax-sig_nmax)/L; % quantization interval
q_level=sig_nmax+Delta/2:Delta:sig_pmax-Delta/2; % define Q-levels
L_sig=length(sig_in); % find signal length
sigp=(sig_in-sig_nmax)/Delta+1/2; % convert into 1/2 to L+1/2 range
qindex=round(sigp); % round to 1, 2, ... L levels
qindex=min(qindex,L); % eliminate L+1 as a rare possibility
q_out=q_level(qindex); % use index vector to generate output
SQNR=20*log10(norm(sig_in)/norm(sig_in-q_out)); %actual SQNR value
end
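
The index arithmetic in uniquan.m is easier to follow on a short vector. The following is a minimal sketch that repeats the same steps inline on a hypothetical four-sample input; the input values and the names x and max_err are assumptions chosen for illustration and do not come from the text.

```matlab
% Minimal sketch: trace the uniquan.m steps on a hypothetical 4-sample input.
x = [-1 -0.3 0.2 1];          % illustrative input (assumed, not from the text)
L = 4;                        % number of quantization levels
Delta = (max(x)-min(x))/L;    % quantization interval
q_level = min(x)+Delta/2:Delta:max(x)-Delta/2;  % midpoints of the L intervals
qindex = round((x-min(x))/Delta+1/2);  % map samples to indices 1..L+1
qindex = min(qindex,L);       % the positive peak maps to L+1; clip it to L
q_out = q_level(qindex);      % quantized samples
max_err = max(abs(x-q_out));  % never exceeds Delta/2
disp([x; q_out]);             % first row: input, second row: quantized output
```

Each sample is replaced by the midpoint of the interval it falls in, so the quantization error is bounded by half the quantization interval.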


The function sampandquant.m executes both sampling and uniform quantization
simultaneously. The sampling period ts is needed, along with the number L of quantization
levels, to generate the sampled output s_out, the sampled and quantized output sq_out,
and the signal after sampling, quantizing, and zero-order-hold sqh_out.


% (sampandquant.m)
function [s_out,sq_out,sqh_out,Delta,SQNR]=sampandquant(sig_in,L,td,ts)
% Usage
%   [s_out,sq_out,sqh_out,Delta,SQNR]=sampandquant(sig_in,L,td,ts)
%   L       - number of uniform quantization levels
%   sig_in  - input signal vector
%   td      - original signal sampling period of sig_in
%   ts      - new sampling period
% NOTE: ts/td must be a positive integer;
% Function outputs:
%   s_out   - sampled output
%   sq_out  - sample-and-quantized output
%   sqh_out - sample, quantize, and hold output
%   Delta   - quantization interval
%   SQNR    - actual signal to quantization noise ratio
if (rem(ts/td,1)==0)
    nfac=round(ts/td);
    p_zoh=ones(1,nfac);
    s_out=downsample(sig_in,nfac);
    [sq_out,Delta,SQNR]=uniquan(s_out,L);
    s_out=upsample(s_out,nfac);
    sqh_out=kron(sq_out,p_zoh);
    sq_out=upsample(sq_out,nfac);
else
    warning('Error! ts/td is not an integer!');
    s_out=[]; sq_out=[]; sqh_out=[]; Delta=[]; SQNR=[];
end
end
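
The three resampling operations used in sampandquant.m can be sketched with core indexing alone, which may help when the functions downsample and upsample are not available. The input vector and the names x_hold and x_zero below are illustrative assumptions.

```matlab
% Minimal sketch: downsampling, zero-order hold, and zero insertion
% using only core functions.
x = 1:6;                         % illustrative input samples (assumed)
nfac = 2;                        % keep every nfac-th sample
xs = x(1:nfac:end);              % same result as downsample(x,nfac)
x_hold = kron(xs,ones(1,nfac));  % zero-order hold: repeat each sample nfac times
x_zero = zeros(1,numel(xs)*nfac);
x_zero(1:nfac:end) = xs;         % same result as upsample(xs,nfac)
disp(x_hold); disp(x_zero);
```

The Kronecker product with a row of ones is what turns the sample sequence into the staircase (flat-top) waveform sqh_out.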


The MATLAB program ExPCM.m provides a numerical example that uses these two 
MATLAB functions to generate PCM signals. 


% (ExPCM.m)
% Example of sampling, quantization, and zero-order hold
clear;clf;
td=0.002; %original sampling rate 500 Hz
t=[0:td:1.]; %time interval of 1 second
xsig=sin(2*pi*t)-sin(6*pi*t); % 1Hz+3Hz sinusoids
Lsig=length(xsig);
Lfft=2^ceil(log2(Lsig)+1);
Xsig=fftshift(fft(xsig,Lfft));
Fmax=1/(2*td);
Faxis=linspace(-Fmax,Fmax,Lfft);
ts=0.02; %new sampling rate = 50Hz.
Nfact=ts/td;

% send the signal through a 16-level uniform quantizer
[s_out,sq_out,sqh_out1,Delta,SQNR]=sampandquant(xsig,16,td,ts);
% obtained the PCM signal which is
% - sampled, quantized, and zero-order hold signal sqh_out
% plot the original signal and the PCM signal in time domain
figure(1);
subplot(211);sfig1=plot(t,xsig,'k',t,sqh_out1(1:Lsig),'b');
set(sfig1,'Linewidth',2);
title('Signal {\it g}({\it t}) and its 16 level PCM signal')
xlabel('time (sec.)');

% send the signal through a 4-level uniform quantizer
[s_out,sq_out,sqh_out2,Delta,SQNR]=sampandquant(xsig,4,td,ts);
% obtained the PCM signal which is
% - sampled, quantized, and zero-order hold signal sqh_out
% plot the original signal and the PCM signal in time domain
subplot(212);sfig2=plot(t,xsig,'k',t,sqh_out2(1:Lsig),'b');
set(sfig2,'Linewidth',2);
title('Signal {\it g}({\it t}) and its 4 level PCM signal')
xlabel('time (sec.)');

Lfft=2^ceil(log2(Lsig)+1);
Fmax=1/(2*td);
Faxis=linspace(-Fmax,Fmax,Lfft);
SQH1=fftshift(fft(sqh_out1,Lfft));
SQH2=fftshift(fft(sqh_out2,Lfft));

% Now use LPF to filter the two PCM signals
BW=10; %Bandwidth is no larger than 10Hz.
H_lpf=zeros(1,Lfft);H_lpf(Lfft/2-BW:Lfft/2+BW-1)=1; %ideal LPF
S1_recv=SQH1.*H_lpf; % ideal filtering
s_recv1=real(ifft(fftshift(S1_recv))); % reconstructed f-domain
s_recv1=s_recv1(1:Lsig); % reconstructed t-domain
S2_recv=SQH2.*H_lpf; % ideal filtering
s_recv2=real(ifft(fftshift(S2_recv))); % reconstructed f-domain
s_recv2=s_recv2(1:Lsig); % reconstructed t-domain

% Plot the filtered signals against the original signal
figure(2)
subplot(211);sfig3=plot(t,xsig,'b-',t,s_recv1,'b-.');
legend('original','recovered')
set(sfig3,'Linewidth',2);
title('Signal {\it g}({\it t}) and filtered 16-level PCM signal')
xlabel('time (sec.)');
subplot(212);sfig4=plot(t,xsig,'b-',t,s_recv2(1:Lsig),'b-.');
legend('original','recovered')
set(sfig4,'Linewidth',2);
title('Signal {\it g}({\it t}) and filtered 4-level PCM signal')
xlabel('time (sec.)');


In the first example, we maintain the 50 Hz sampling frequency and utilize L = 16
uniform quantization levels. The resulting PCM signal is shown in Fig. 6.42. This PCM
signal can be low-pass-filtered at the receiver and compared against the original message
signal, as shown in Fig. 6.43. The recovered signal is seen to be very close to the original
signal g(t).

To illustrate the effect of quantization, we next apply L = 4 PCM quantization levels. The
resulting PCM signal is again shown in Fig. 6.42. The corresponding signal recovery is given
in Fig. 6.43. It is very clear that a smaller number of quantization levels (L = 4) leads to a much
larger approximation error.
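
The loss in accuracy can also be put in numbers. The following is a minimal sketch, not part of ExPCM.m, that measures the SQNR of the same test signal at the 50 Hz rate for L = 16 and L = 4 by repeating the uniquan.m steps inline; the names Lvals, idx, and q are assumptions. Under the usual model in which quantization noise power is proportional to the square of the quantization interval, going from 16 to 4 levels would be expected to cost roughly 12 dB, although the measured gap for a short record need not match that figure exactly.

```matlab
% Minimal sketch: measured SQNR of a uniform quantizer for L = 16 and L = 4.
t = 0:0.02:1;                          % 50 Hz sampling, as in ExPCM.m
x = sin(2*pi*t)-sin(6*pi*t);           % same test signal as in the text
Lvals = [16 4];
SQNR = zeros(size(Lvals));
for i = 1:numel(Lvals)
    L = Lvals(i);
    Delta = (max(x)-min(x))/L;         % quantization interval
    idx = min(round((x-min(x))/Delta+1/2),L);   % level index, 1..L
    q = min(x)+Delta/2+(idx-1)*Delta;  % midpoint of the selected interval
    SQNR(i) = 20*log10(norm(x)/norm(x-q));      % measured SQNR in dB
end
fprintf('L = %2d: SQNR = %.1f dB\n',[Lvals; SQNR]);
```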



Figure 6.43: Comparison between the original signal and the PCM signals after low-pass
filtering to recover the original message. (Panel title: Signal g(t) and filtered 4-PCM signal.)

Delta Modulation

Instead of applying PCM, we illustrate the practical effect of step size selection $\Delta$ in the design
of a DM encoder. The basic function to implement DM is given in deltamod.m.


% (deltamod.m)
function s_DMout=deltamod(sig_in,Delta,td,ts)
% Usage
%   s_DMout=deltamod(xsig,Delta,td,ts)
%   Delta   - DM stepsize
%   sig_in  - input signal vector
%   td      - original signal sampling period of sig_in
%   ts      - new sampling period
% NOTE: ts/td must be a positive integer;
% Function outputs:
%   s_DMout - DM sampled output
if (rem(ts/td,1)==0)
    nfac=round(ts/td);
    p_zoh=ones(1,nfac);
    s_down=downsample(sig_in,nfac);
    Num_it=length(s_down);
    s_DMout(1)=-Delta/2;
    for k=2:Num_it
        xvar=s_DMout(k-1);
        s_DMout(k)=xvar+Delta*sign(s_down(k-1)-xvar);
    end
    s_DMout=kron(s_DMout,p_zoh);
else
    warning('Error! ts/td is not an integer!');
    s_DMout=[];
end
end


To generate DM signals with different step sizes, we apply the same signal g(t) as used
in the PCM example. The MATLAB program ExDM.m applies three step sizes: $\Delta_1 = 0.2$,
$\Delta_2 = 2\Delta_1$, and $\Delta_3 = 4\Delta_1$.


% (ExDM.m)
% Example of sampling, delta modulation, and zero-order hold
clear;clf;
td=0.002; %original sampling rate 500 Hz
t=[0:td:1.]; %time interval of 1 second
xsig=sin(2*pi*t)-sin(6*pi*t); % 1Hz+3Hz sinusoids
Lsig=length(xsig);
ts=0.02; %new sampling rate = 50Hz.
Nfact=ts/td;

% send the signal through the DM encoder
Delta1=0.2; % First select a small Delta=0.2 in DM
s_DMout1=deltamod(xsig,Delta1,td,ts);
% obtained the DM signal
% plot the original signal and the DM signal in time domain
figure(1);
subplot(311);sfig1=plot(t,xsig,'k',t,s_DMout1(1:Lsig),'b');
set(sfig1,'Linewidth',2);
title('Signal {\it g}({\it t}) and DM signal')
xlabel('time (sec.)'); axis([0 1 -2.2 2.2]);
%
% Apply DM again by doubling the Delta
Delta2=2*Delta1; %
s_DMout2=deltamod(xsig,Delta2,td,ts);
% obtained the DM signal
% plot the original signal and the DM signal in time domain
subplot(312);sfig2=plot(t,xsig,'k',t,s_DMout2(1:Lsig),'b');
set(sfig2,'Linewidth',2);
title('Signal {\it g}({\it t}) and DM signal with doubled stepsize')
xlabel('time (sec.)'); axis([0 1 -2.2 2.2]);
%
Delta3=2*Delta2; % Double the DM Delta again.
s_DMout3=deltamod(xsig,Delta3,td,ts);
% plot the original signal and the DM signal in time domain
subplot(313);sfig3=plot(t,xsig,'k',t,s_DMout3(1:Lsig),'b');
set(sfig3,'Linewidth',2);
title('Signal {\it g}({\it t}) and DM signal with quadrupled stepsize')
xlabel('time (sec.)'); axis([0 1 -2.2 2.2]);


To illustrate the effect of DM, the resulting signals from the DM encoder are shown in
Fig. 6.44. This example clearly shows that when the step size is too small ($\Delta_1$), there is a
severe overloading effect as the original signal varies so fast that the small step size is unable
to catch up. Doubling the DM step size clearly solves the overloading problem in this example.
However, quadrupling the step size ($\Delta_3$) would lead to unnecessarily large quantization error.
This example thus confirms our earlier analysis that a careful selection of the DM step size is
critical.
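
The step size choice can be checked numerically before running the encoder. Because the DM output moves by exactly one step per sample, it can follow the input only if the step is at least as large as the biggest change between consecutive input samples. The following is a minimal sketch of that check for the test signal at the 50 Hz rate; it is an illustration under that assumption, not part of ExDM.m, and the names max_step and ratio are hypothetical.

```matlab
% Minimal sketch: compare each DM step size with the largest change
% between consecutive samples of the test signal.
ts = 0.02;                               % DM sampling period (50 Hz)
t = 0:ts:1;
x = sin(2*pi*t)-sin(6*pi*t);             % same test signal as in ExDM.m
max_step = max(abs(diff(x)));            % largest sample-to-sample change
Delta = [0.2 0.4 0.8];                   % the three step sizes in the text
ratio = max_step./Delta;                 % required change per available step
for k = 1:numel(Delta)
    fprintf('Delta = %.1f, max change/Delta = %.2f\n',Delta(k),ratio(k));
end
% A ratio above 1 means the staircase cannot keep up (slope overload);
% a ratio far below 1 means the step is larger than the signal requires.
```

For the smallest step the ratio exceeds 1, which matches the overload seen in Fig. 6.44, while for the largest step it is well below 1, which matches the coarse granular error.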
PROBLEMS


6.1-1 Figure P6.1-1 shows Fourier spectra of signals $g_1(t)$ and $g_2(t)$. Determine the Nyquist interval
and the sampling rate for signals $g_1(t)$, $g_2(t)$, $g_1^2(t)$, $g_2^m(t)$, and $g_1(t)g_2(t)$.
Hint: Use the frequency convolution and the width property of the convolution.



6.1-2 Determine the Nyquist sampling rate and the Nyquist sampling interval for the signals: 

(a) $\mathrm{sinc}\,(100\pi t)$

(b) $\mathrm{sinc}^2\,(100\pi t)$

(c) $\mathrm{sinc}\,(100\pi t) + \mathrm{sinc}\,(50\pi t)$

(d) $\mathrm{sinc}\,(100\pi t) + 3\,\mathrm{sinc}^2\,(60\pi t)$

(e) $\mathrm{sinc}\,(50\pi t)\,\mathrm{sinc}\,(100\pi t)$

6.1-3 A signal g(t) band-limited to B Hz is sampled by a periodic pulse train $p_{T_s}(t)$ made up of a
rectangular pulse of width 1/8B second (centered at the origin) repeating at the Nyquist rate
(2B pulses per second). Show that the sampled signal $\bar{g}(t)$ is given by

$$\bar{g}(t) = \frac{1}{4}g(t) + \sum_{n=1}^{\infty} \frac{2}{n\pi}\sin\left(\frac{n\pi}{4}\right) g(t)\cos 4n\pi Bt$$

Show that the signal g(t) can be recovered by passing $\bar{g}(t)$ through an ideal low-pass filter of
bandwidth B Hz and a gain of 4.

6.1-4 A signal $g(t) = \mathrm{sinc}^2\,(5\pi t)$ is sampled (using uniformly spaced impulses) at a rate of (i) 5 Hz;
(ii) 10 Hz; (iii) 20 Hz. For each of the three cases:

(a) Sketch the sampled signal. 

(b) Sketch the spectrum of the sampled signal. 

(c) Explain whether you can recover the signal g(t) from the sampled signal. 

(d) If the sampled signal is passed through an ideal low-pass filter of bandwidth 5 Hz, sketch the 
spectrum of the output signal. 

6.1-5 Signals $g_1(t) = 10^4\,\Pi(10^4 t)$ and $g_2(t) = \delta(t)$ are applied at the inputs of ideal low-pass filters
$H_1(f) = \Pi(f/20{,}000)$ and $H_2(f) = \Pi(f/10{,}000)$ (Fig. P6.1-5). The outputs $y_1(t)$ and $y_2(t)$
of these filters are multiplied to obtain the signal $y(t) = y_1(t)y_2(t)$. Find the Nyquist rate of
$y_1(t)$, $y_2(t)$, and $y(t)$. Use the convolution property and the width property of convolution to
determine the bandwidth of $y_1(t)y_2(t)$. See also Prob. 6.1-1.



6.1-6 A zero-order hold circuit (Fig. P6.1-6) is often used to reconstruct a signal g(t) from its samples.

(a) Find the unit impulse response of this circuit.

(b) Find the transfer function H(f) and sketch |H(f)|.

(c) Show that when a sampled signal $\bar{g}(t)$ is applied at the input of this circuit, the output is a
staircase approximation of g(t). The sampling interval is $T_s$.

6.1-7 (a) A first-order hold circuit can also be used to reconstruct a signal g(t) from its samples. The
impulse response of this circuit is

$$h(t) = \Delta\left(\frac{t}{2T_s}\right)$$

where $T_s$ is the sampling interval. Consider a typical sampled signal $\bar{g}(t)$ and show that
this circuit performs the linear interpolation. In other words, the filter output consists of
sample tops connected by straight-line segments. Follow the procedure discussed in Sec.
6.1.1 (Fig. 6.2b).

(b) Determine the transfer function of this filter and its amplitude response, and compare it with
the ideal filter required for signal reconstruction.

(c) This filter, being noncausal, is unrealizable. Suggest a modification that will make this filter
realizable. How would such a modification affect the reconstruction of g(t) from its samples?
How would it affect the frequency response of the filter?

6.1-8 Prove that a signal cannot be simultaneously time-limited and band-limited.

Hint: Show that the contrary assumption leads to contradiction. Assume a signal simultaneously
time-limited and band-limited so that $G(f) = 0$ for $|f| > B$. In this case, $G(f) = G(f)\,\Pi(f/2B')$
for $B' > B$. This means that g(t) is equal to $g(t) * 2B'\,\mathrm{sinc}\,(2\pi B't)$. Show that the latter cannot
be time-limited.

6.2-1 The American Standard Code for Information Interchange (ASCII) has 128 characters, which
are binary-coded. If a certain computer generates 100,000 characters per second, determine the
following:

(a) The number of bits (binary digits) required per character.

(b) The number of bits per second required to transmit the computer output, and the minimum
bandwidth required to transmit this signal.

(c) For single error detection capability, an additional bit (parity bit) is added to the code of each
character. Modify your answers in parts (a) and (b) in view of this information.

6.2-2 A compact disc (CD) records audio signals digitally by using PCM. Assume that the audio signal
bandwidth equals 15 kHz.

(a) If the Nyquist samples are uniformly quantized into L = 65,536 levels and then binary-coded,
determine the number of binary digits required to encode a sample.

(b) If the audio signal has an average power of 0.1 watt and a peak voltage of 1 volt, find the resulting
signal-to-quantization-noise ratio (SQNR) of the uniform quantizer output in part (a).

(c) Determine the number of binary digits per second (bit/s) required to encode the audio signal. 

(d) For practical reasons discussed in the text, signals are sampled at a rate well above the Nyquist 
rate. Practical CDs use 44,100 samples per second. If L = 65,536, determine the number 
of bits per second required to encode the signal, and the minimum bandwidth required to 
transmit the encoded signal. 

6.2-3 A television signal (video and audio) has a bandwidth of 4.5 MHz. This signal is sampled,
quantized, and binary coded to obtain a PCM signal.

(a) Determine the sampling rate if the signal is to be sampled at a rate 20% above the Nyquist
rate.

(b) If the samples are quantized into 1024 levels, determine the number of binary pulses required 
to encode each sample. 

(c) Determine the binary pulse rate (bits per second) of the binary-coded signal, and the minimum 
bandwidth required to transmit this signal. 

6.2-4 Five telemetry signals, each of bandwidth 240 Hz, are to be transmitted simultaneously by binary
PCM. The signals must be sampled at least 20% above the Nyquist rate. Framing and synchronizing
requires an additional 0.5% extra bits. A PCM encoder is used to convert these signals before
they are time-multiplexed into a single data stream. Determine the minimum possible data rate
(bits per second) that must be transmitted, and the minimum bandwidth required to transmit the
multiplex signal.

6.2-5 It is desired to set up a central station for simultaneous monitoring of the electrocardiograms
(ECGs) of 10 hospital patients. The data from the 10 patients are brought to a processing center over
wires and are sampled, quantized, binary-coded, and time-division-multiplexed. The multiplexed
data are now transmitted to the monitoring station (Fig. P6.2-5). The ECG signal bandwidth
is 100 Hz. The maximum acceptable error in sample amplitudes is 0.25% of the peak signal
amplitude. The sampling rate must be at least twice the Nyquist rate. Determine the minimum
cable bandwidth needed to transmit these data.

Figure P6.2-5

6.2-6 A message signal m(t) is transmitted by binary PCM without compression. If the SQNR is required
to be at least 47 dB, determine the minimum value of $L = 2^n$ required, assuming that m(t) is
sinusoidal. Determine the actual SQNR obtained with this minimum L.

6.2-7 Repeat Prob. 6.2-6 for m(t) shown in Fig. P6.2-7.

Hint: The power of a periodic signal is its energy averaged over one cycle. In this case, however,
because the signal amplitude takes on the same values every quarter cycle, the power can also be
found by averaging the signal energy over a quarter cycle.

Figure P6.2-7



6.2-8 For a PCM signal, determine L if the compression parameter $\mu = 100$ and the minimum SNR
required is 45 dB. Determine the output SQNR with this value of L. Remember that L must be a
power of 2, that is, $L = 2^n$ for a binary PCM.

6.2-9 A signal band-limited to 1 MHz is sampled at a rate 50% higher than the Nyquist rate and quantized
into 256 levels by using a $\mu$-law quantizer with $\mu = 255$.

(a) Determine the signal-to-quantization-noise ratio.

(b) The SQNR (the received signal quality) found in part (a) was unsatisfactory. It must be
increased at least by 10 dB. Would you be able to obtain the desired SQNR without increasing
the transmission bandwidth if it was found that a sampling rate 20% above the Nyquist rate is
adequate? If so, explain how. What is the maximum SQNR that can be realized in this way?

6.2-10 The output SQNR of a 10-bit PCM was found to be insufficient at 30 dB. To achieve the desired
SNR of 42 dB, it was decided to increase the number of quantization levels L. Find the fractional 
increase in the transmission bandwidth required for this increase in L. 

6.4-1 In a certain telemetry system, there are four analog signals $m_1(t)$, $m_2(t)$, $m_3(t)$, and $m_4(t)$. The
bandwidth of $m_1(t)$ is 3.6 kHz, but for each of the remaining signals it is 1.4 kHz. These signals
are to be sampled at rates no less than their respective Nyquist rates and are to be word-by-word
multiplexed. This can be achieved by multiplexing the PAM samples of the four signals and
then binary coding the multiplexed samples (as in the case of the PCM T1 carrier in Fig. 6.20a).
Suggest a suitable multiplexing scheme for this purpose. What is the commutator frequency (in
rotations per second)? Note: In this case you may have to sample some signal(s) at rates higher
than their Nyquist rate(s).

6.4-2 Repeat Prob. 6.4-1 if there are four signals $m_1(t)$, $m_2(t)$, $m_3(t)$, and $m_4(t)$ with bandwidths 1200,
700, 300, and 200 Hz, respectively.

Hint: First multiplex $m_2$, $m_3$, and $m_4$ and then multiplex this composite signal with $m_1(t)$.

6.4-3 A signal $m_1(t)$ is band-limited to 3.6 kHz, and the three other signals $m_2(t)$, $m_3(t)$, and $m_4(t)$ are
band-limited to 1.2 kHz each. These signals are sampled at the Nyquist rate and binary coded using
512 levels (L = 512). Suggest a suitable bit-by-bit multiplexing arrangement (as in Fig. 6.12).
What is the commutator frequency (in rotations per second), and what is the output bit rate?

6.7-1 In a single-integration DM system, the voice signal is sampled at a rate of 64 kHz, similar to
PCM. The maximum signal amplitude is normalized as $A_{\max} = 1$.

(a) Determine the minimum value of the step size $\sigma$ to avoid slope overload.

(b) Determine the granular noise power $N_o$ if the voice signal bandwidth is 3.4 kHz.

(c) Assuming that the voice signal is sinusoidal, determine $S_o$ and the SNR.

(d) Assuming that the voice signal amplitude is uniformly distributed in the range (-1, 1),
determine $S_o$ and the SNR.

(e) Determine the minimum transmission bandwidth. 




7 PRINCIPLES OF DIGITAL DATA TRANSMISSION


Throughout most of the twentieth century, a significant percentage of communication
systems was in analog form. However, by the end of the 1990s, the digital format began 
to dominate most applications. One does not need to look hard to witness the continuous 
migration from analog to digital communications: from audiocassette tape to MP3 and CD, 
from NTSC analog TV to digital HDTV, from traditional telephone to VoIP, and from VHS 
videotape to DVD. In fact, even the last analog refuge of broadcast radio is facing a strong 
digital competitor in the form of satellite radio. Given the dominating importance of digital 
communication systems in our lives today, it is never too early to study the basic principles 
and various aspects of digital data transmission, as we will do in this chapter. 

This chapter deals with the problems of transmitting digital data over a channel. Hence, 
the starting messages are assumed to be digital. We shall begin by considering the binary case, 
where the data consist of only two symbols: 1 and 0. We assign a distinct waveform (pulse) 
to each of these two symbols. The resulting sequence of these pulses is transmitted over a 
channel. At the receiver, these pulses are detected and are converted back to binary data (1s
and 0s).


7.1 DIGITAL COMMUNICATION SYSTEMS 

A digital communication system consists of several components, as shown in Fig. 7.1. In 
this section, we conceptually outline their functionalities in the communication systems. The 
details of their analysis and design will be given in dedicated sections later in this chapter. 


7.1.1 Source 

The input to a digital system takes the form of a sequence of digits. The input could be the 
output from a data set, a computer, or a digitized audio signal (PCM, DM, or LPC), digital 
facsimile or HDTV, or telemetry data, and so on. Although most of the discussion in this chapter 
is confined to the binary case (communication schemes using only two symbols), the more 
general case of M-ary communication, which uses M symbols, will also be discussed in Secs. 7.7
and 7.9.

Figure 7.1

Figure 7.2: Line code examples: (a) on-off (RZ); (b) polar (RZ); (c) bipolar (RZ);
(d) on-off (NRZ); (e) polar (NRZ).


7.1.2 Line Coder 

The digital output of a source encoder is converted (or coded) into electrical pulses (waveforms)
for the purpose of transmission over the channel. This process is called line coding
or transmission coding. There are many possible ways of assigning waveforms (pulses) to
the digital data. In the binary case (2 symbols), for example, conceptually the simplest line
code is on-off, where a 1 is transmitted by a pulse p(t) and a 0 is transmitted by no pulse
(zero signal), as shown in Fig. 7.2a. Another commonly used code is polar, where 1 is transmitted
by a pulse p(t) and 0 is transmitted by a pulse -p(t) (Fig. 7.2b). The polar scheme is
the most power-efficient code because it requires the least power for a given noise immunity
(error probability). Another popular code in PCM is bipolar, also known as pseudoternary
or alternate mark inversion (AMI), where 0 is encoded by no pulse and 1 is encoded by
a pulse p(t) or -p(t) depending on whether the previous 1 is encoded by -p(t) or p(t). In
short, pulses representing consecutive 1s alternate in sign, as shown in Fig. 7.2c. This code
has the advantage that if one single error is made in the detecting of pulses, the received pulse
sequence will violate the bipolar rule and the error can be detected (although not corrected)
immediately.*

Another line code that appeared promising earlier is the duobinary (and modified duobinary)
proposed by Lender [1, 2]. This code is better than the bipolar in terms of bandwidth
efficiency. Its more prominent variant, the modified duobinary line code, has seen applications
in hard disk drive read channels, in optical 10 Gbit/s transmission for metronetworks,
and in the first-generation modems for integrated services digital networks (ISDN). Details of
duobinary line codes will be discussed later in this chapter.

In our discussion so far, we have used half-width pulses just for the sake of illustration. We 
can select other widths also. Full-width pulses are often used in some applications. Whenever 
full-width pulses are used, the pulse amplitude is held to a constant value throughout the 
pulse interval (i.e., it does not have a chance to go to zero before the next pulse begins). 
For this reason, these schemes are called non-return-to-zero or NRZ schemes, in contrast 
to return-to-zero or RZ schemes (Fig. 7.2a-c). Figure 7.2d shows an on-off NRZ signal, 
whereas Fig. 7.2e shows a polar NRZ signal. 
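
The three mappings described in this section can be sketched in a few lines. The following is a minimal illustration that produces only the pulse amplitudes (+1, 0, or -1 times p(t)) for a hypothetical bit sequence, and then shows how a single detection error breaks the bipolar rule. The bit pattern, the error position, and all variable names are assumptions made for the example.

```matlab
% Minimal sketch: symbol amplitudes for three binary line codes.
bits = [1 0 1 1 0 1];                   % illustrative data (assumed)
a_onoff = bits;                         % on-off: 1 -> pulse, 0 -> no pulse
a_polar = 2*bits-1;                     % polar: 1 -> +p(t), 0 -> -p(t)
a_bipolar = zeros(size(bits));          % bipolar (AMI): 0 -> no pulse
idx = find(bits==1);
a_bipolar(idx) = (-1).^(0:numel(idx)-1);  % successive 1s alternate in sign

% A single detection error breaks the alternation and can be flagged.
rx = a_bipolar; rx(2) = 1;              % hypothetical error: 0 read as +1
nz = rx(rx~=0);                         % nonzero pulses in order of arrival
violation = any(nz(1:end-1)==nz(2:end));  % two like-signed pulses in a row
```

The check flags that an error occurred but not which pulse is wrong, which is why the bipolar rule supports detection and not correction.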


7.1.3 Multiplexer 

Generally speaking, the capacity of a physical channel (e.g., coaxial cable, optic fiber) for
transmitting data is much larger than the data rate of individual sources. To utilize this capacity
effectively, we combine several sources by means of a digital multiplexer. The digital
multiplexing can be achieved through frequency division or time division, as we have already
discussed. Alternatively, code division is also a practical and effective approach (to be discussed
in Chapter 11). Thus a physical channel is normally shared by several messages simultaneously.


7.1.4 Regenerative Repeater 

Regenerative repeaters are used at regularly spaced intervals along a digital transmission line 
to detect the incoming digital signal and regenerate new “clean” pulses for further transmission 
along the line. This process periodically eliminates, and thereby combats, accumulation of noise 
and signal distortion along the transmission path. The ability of such regenerative repeaters 
to effectively eliminate noise and signal distortion effects is one of the biggest advantages of 
digital communication systems over their analog counterparts. 

If the pulses are transmitted at a rate of $R_b$ pulses per second, we require the periodic
timing information (the clock signal at $R_b$ Hz) to sample the incoming pulses at a repeater.
This timing information can be extracted from the received signal itself if the line code is
chosen properly. When the RZ polar signal in Fig. 7.2b is rectified, for example, it results in a
periodic signal of clock frequency $R_b$ Hz, which contains the desired periodic timing signal of
frequency $R_b$ Hz. When this signal is applied to a resonant circuit tuned to frequency $R_b$, the
output, which is a sinusoid of frequency $R_b$ Hz, can be used for timing. The on-off signal can
be expressed as a sum of a periodic signal (of clock frequency) and a polar, or random, signal
as shown in Fig. 7.3. Because of the presence of the periodic component, we can extract the
timing information from this signal by using a resonant circuit tuned to the clock frequency. A
bipolar signal, when rectified, becomes an on-off signal. Hence, its timing information can be
extracted in the same way as that for an on-off signal.
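
The effect of rectification can be checked on sampled waveforms. The following is a minimal sketch, assuming a half-width rectangular RZ pulse represented by four samples per bit and an arbitrary bit pattern; all names and values are illustrative. It confirms that the rectified polar RZ signal carries one pulse in every bit interval whatever the data, whereas the on-off RZ signal has empty intervals wherever a 0 is sent.

```matlab
% Minimal sketch: rectifying a polar RZ signal removes the data dependence.
bits = [1 0 0 1 1 0 1 0];            % illustrative data (assumed)
pulse = [1 1 0 0];                   % half-width RZ pulse, 4 samples per bit
y_polar = kron(2*bits-1,pulse);      % +p(t) for 1, -p(t) for 0
y_rect = abs(y_polar);               % full-wave rectification
clock_like = kron(ones(size(bits)),pulse);  % one pulse in every bit interval
same = isequal(y_rect,clock_like);   % true for any bit pattern
y_onoff = kron(bits,pulse);          % on-off RZ: no pulse during a 0
gaps = sum(bits==0);                 % bit intervals with no pulse
```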


* This assumes no more than one error in sequence. Multiple errors in sequence could cancel their respective effects 
and remain undetected. However, the probability of multiple errors is much smaller than that of single errors. Even 
Figure 7.3

The timing signal (the resonant circuit output) is sensitive to the incoming bit pattern. In
the on-off or bipolar case, a 0 is transmitted by “no pulse.” Hence, if there are too many 0s in a
sequence (no pulses), there is no signal at the input of the resonant circuit and the sinusoidal
output of the resonant circuit starts decaying, thus causing error in the timing information. We
shall discuss later ways of overcoming this problem. A line code in which the bit pattern does
not affect the accuracy of the timing information is said to be a transparent line code. The RZ
polar scheme (where each bit is transmitted by some pulse) is transparent, whereas the on-off
and bipolar are nontransparent.


7.2 LINE CODING 

Digital data can be transmitted by various transmission or line codes. We have given examples
of on-off, polar, and bipolar. Each line code has its advantages and disadvantages. Among other
desirable properties, a line code should have the following properties.

• Transmission bandwidth. It should be as small as possible.

• Power efficiency. For a given bandwidth and a specified detection error rate, the transmitted
power should be as low as possible.

• Error detection and correction capability. It is desirable to detect, and preferably correct,
detection errors. In a bipolar case, for example, a single error will cause bipolar violation
and can easily be detected. Error correcting codes will be discussed in depth in Chapter 14.

• Favorable power spectral density. It is desirable to have zero power spectral density (PSD) at
f = 0 (dc) because ac coupling and transformers are often used at the repeaters.† Significant
power in low-frequency components should also be avoided because it causes dc wander in
the pulse stream when ac coupling is used.

• Adequate timing content. It should be possible to extract timing or clock information from
the signal.

• Transparency. It should be possible to correctly transmit a digital signal regardless of the
pattern of 1s and 0s. We saw earlier that a long string of 0s could cause problems in timing
extraction for the on-off and bipolar cases. A code is transparent if the data are so coded that
for every possible sequence of data, the coded signal is received faithfully.


for single errors, we cannot tell exactly where the error is located. Therefore, this code can detect the presence of
single errors, but it cannot correct them.

† The ac coupling is required because the dc paths provided by the cable pairs between the repeater sites are used to
transmit the power needed to operate the repeaters.


7.2.1 PSD of Various Line Codes 

In Example 3.19 we discussed a procedure for finding the PSD of a polar pulse train. We shall
use a similar procedure to find a general expression for PSD of the baseband modulation (line
coding) output signals as shown in Fig. 7.1. In particular, we directly apply the relationship
between the PSD and the autocorrelation function of the baseband modulation signal given in
Section 3.8 [Eq. (3.85)].

In the following discussion, we consider a generic pulse p(t) whose corresponding Fourier
transform is P(f). We can denote the line code symbol at time k as $a_k$. When the transmission
rate is $R_b = 1/T_b$ pulses per second, the line code generates a pulse train constructed from the
basic pulse p(t) with amplitude $a_k$ starting at time $t = kT_b$; in other words, the kth symbol is
transmitted as $a_k p(t - kT_b)$. Figure 7.4a provides an illustration of a special pulse p(t), whereas
Fig. 7.4b shows the corresponding pulse train generated by the line coder at baseband. As shown
in Fig. 7.4b, counting a succession of symbol transmissions $T_b$ seconds apart, the baseband signal
is a pulse train of the form

$$y(t) = \sum_{k} a_k\, p(t - kT_b) \tag{7.1}$$

Note that the line coder determines the symbol $\{a_k\}$ as the amplitude of the pulse $p(t - kT_b)$.

The values $a_k$ are random and depend on the line coder input and the line code itself;
$y(t)$ is a pulse-amplitude-modulated (PAM) signal. The on-off, polar, and bipolar line codes
are all special cases of this pulse train $y(t)$, where $a_k$ takes on values 0, 1, or $-1$ randomly,
subject to some constraints. We can, therefore, analyze many line codes according to the PSD
of $y(t)$. Unfortunately, the PSD of $y(t)$ depends on both $a_k$ and $p(t)$. If the pulse shape $p(t)$
changes, we may have to derive the PSD all over again. This difficulty can be overcome by the
simple artifice of selecting a PAM signal $x(t)$ that uses a unit impulse for the basic pulse $p(t)$
(Fig. 7.4c). The impulses are at the intervals of $T_b$ and the strength (area) of the kth impulse
is $a_k$. If $x(t)$ is applied to the input of a filter that has a unit impulse response $h(t) = p(t)$
(Fig. 7.4d), the output will be the pulse train $y(t)$ in Fig. 7.4b. Also, applying Eq. (3.92), the
PSD of $y(t)$ is

$$S_y(f) = |P(f)|^2 S_x(f)$$

This relationship allows us to determine $S_y(f)$, the PSD of a line code corresponding to any
pulse shape $p(t)$, once we know $S_x(f)$. This approach is attractive because of its generality.

We now need to derive $\mathcal{R}_x(\tau)$, the time autocorrelation function of the impulse train $x(t)$.
This can be conveniently done by considering the impulses as a limiting form of the rectangular
pulses, as shown in Fig. 7.5a. Each pulse has a width $\epsilon \to 0$, and the kth pulse height

$$h_k = \frac{a_k}{\epsilon} \to \infty$$

This way, we guarantee that the strength of the kth impulse is $a_k$, or

$$\epsilon h_k = a_k$$

If we designate the corresponding rectangular pulse train by $\hat{x}(t)$, then by definition [Eq. (3.82)
in Sec. 3.8]

$$\mathcal{R}_{\hat{x}}(\tau) = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} \hat{x}(t)\,\hat{x}(t-\tau)\,dt \tag{7.2}$$


Because $\mathcal{R}_{\hat{x}}(\tau)$ is an even function of $\tau$ [Eq. (3.83)], we need to consider only positive $\tau$. To
begin with, consider the case of $\tau < \epsilon$. In this case the integral in Eq. (7.2) is the area under the
signal $\hat{x}(t)$ multiplied by $\hat{x}(t)$ delayed by $\tau$ ($\tau < \epsilon$). As seen from Fig. 7.5b, the area associated
with the kth pulse is $h_k^2(\epsilon - \tau)$, and

$$\mathcal{R}_{\hat{x}}(\tau) = \lim_{T\to\infty}\frac{1}{T}\sum_k h_k^2(\epsilon-\tau) = \lim_{T\to\infty}\frac{1}{T}\sum_k a_k^2\,\frac{\epsilon-\tau}{\epsilon^2} \tag{7.3a}$$

$$= \frac{R_0}{\epsilon T_b}\left(1 - \frac{\tau}{\epsilon}\right), \qquad R_0 = \lim_{T\to\infty}\frac{T_b}{T}\sum_k a_k^2 \tag{7.3b}$$

Figure 7.5

During the averaging interval $T$ ($T \to \infty$), there are $N$ pulses ($N \to \infty$), where

$$N = \frac{T}{T_b} \tag{7.4}$$

and from Eq. (7.3b)

$$R_0 = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k^2 \tag{7.5}$$

Observe that the summation is over $N$ pulses. Hence, $R_0$ is the time average of the square of
the pulse amplitudes $a_k$. Using our time average notation, we can express $R_0$ as

$$R_0 = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k^2 = \overline{a_k^2} \tag{7.6}$$


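We also know that $\mathcal{R}_{\hat{x}}(\tau)$ is an even function of $\tau$ [see Eq. (3.83)]. Hence, Eq. (7.3) can be
expressed as

$$\mathcal{R}_{\hat{x}}(\tau) = \frac{R_0}{\epsilon T_b}\left(1 - \frac{|\tau|}{\epsilon}\right) \qquad |\tau| < \epsilon \tag{7.7}$$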


This is a triangular pulse of height $R_0/\epsilon T_b$ and width $2\epsilon$ centered at $\tau = 0$ (Fig. 7.5d).
This is expected because as $\tau$ increases beyond $\epsilon$, there is no overlap between the delayed
signal $\hat{x}(t - \tau)$ and $\hat{x}(t)$; hence, $\mathcal{R}_{\hat{x}}(\tau) = 0$, as seen from Fig. 7.5d. But as we increase $\tau$
further, we find that the kth pulse of $\hat{x}(t - \tau)$ will start overlapping the (k + 1)th pulse of $\hat{x}(t)$
as $\tau$ approaches $T_b$ (Fig. 7.5c). Repeating the earlier argument, we see that $\mathcal{R}_{\hat{x}}(\tau)$ will have
another triangular pulse of width $2\epsilon$ centered at $\tau = T_b$ and of height $R_1/\epsilon T_b$, where

$$R_1 = \lim_{T\to\infty}\frac{T_b}{T}\sum_k a_k a_{k+1} = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k a_{k+1} = \overline{a_k a_{k+1}}$$

Observe that $R_1$ is obtained by multiplying every pulse strength ($a_k$) by the strength of its
immediate neighbor ($a_{k+1}$), adding all these products, and then dividing by the total number
of pulses. This is clearly the time average (mean) of the product $a_k a_{k+1}$ and is, in our notation,
$\overline{a_k a_{k+1}}$.

A similar thing happens around $\tau = 2T_b, 3T_b, \ldots$. Hence, $\mathcal{R}_{\hat{x}}(\tau)$ consists of a
sequence of triangular pulses of width $2\epsilon$ centered at $\tau = 0, \pm T_b, \pm 2T_b, \ldots$. The height of
the pulses centered at $\pm nT_b$ is $R_n/\epsilon T_b$, where

$$R_n = \lim_{T\to\infty}\frac{T_b}{T}\sum_k a_k a_{k+n} = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k a_{k+n} = \overline{a_k a_{k+n}}$$

$R_n$ is essentially the discrete autocorrelation function of the line code symbols $\{a_k\}$.

To find $\mathcal{R}_x(\tau)$, we let $\epsilon \to 0$ in $\mathcal{R}_{\hat{x}}(\tau)$. As $\epsilon \to 0$, the width of each triangular pulse $\to 0$
and the height $\to \infty$ in such a way that the area is still finite. Thus, in the limit as $\epsilon \to 0$, the
triangular pulses become impulses. For the nth pulse centered at $nT_b$, the height is $R_n/\epsilon T_b$ and
the area is $R_n/T_b$. Hence (Fig. 7.5e),

$$\mathcal{R}_x(\tau) = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} R_n\,\delta(\tau - nT_b) \tag{7.8}$$


The PSD $S_x(f)$ is the Fourier transform of $\mathcal{R}_x(\tau)$. Therefore,

$$S_x(f) = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} R_n e^{-jn2\pi fT_b} \tag{7.9}$$

Recognizing that $R_{-n} = R_n$ [because $\mathcal{R}(\tau)$ is an even function of $\tau$], we have

$$S_x(f) = \frac{1}{T_b}\left[R_0 + 2\sum_{n=1}^{\infty} R_n \cos n2\pi fT_b\right] \tag{7.10}$$


The input $x(t)$ to the filter with impulse response $h(t) = p(t)$ results in the output $y(t)$, as
shown in Fig. 7.4d. If $p(t) \Longleftrightarrow P(f)$, the transfer function of the filter is $H(f) = P(f)$, and
according to Eq. (3.91),

$$
\begin{aligned}
S_y(f) &= |P(f)|^2 S_x(f) && (7.11\text{a})\\
&= \frac{|P(f)|^2}{T_b}\left[\sum_{n=-\infty}^{\infty} R_n e^{-jn2\pi fT_b}\right] && (7.11\text{b})\\
&= \frac{|P(f)|^2}{T_b}\left[R_0 + 2\sum_{n=1}^{\infty} R_n \cos n2\pi fT_b\right] && (7.11\text{c})
\end{aligned}
$$

Thus, the PSD of a line code is fully characterized by its $R_n$ and the pulse-shaping selection
$P(f)$. We shall now use this general result to find the PSDs of various specific line codes by
first determining the symbol autocorrelation $R_n$.
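As a numerical illustration of Eq. (7.11c), the following minimal sketch estimates $R_n$ as a time average over a long random symbol sequence and then evaluates the PSD formula. The function names, the random seed, the sequence length, and the choice of a half-width rectangular pulse are assumptions made only for this example; they are not taken from the text.

```python
import math
import random


def symbol_autocorrelation(a, n):
    """Time-average estimate of R_n = mean of a_k * a_{k+n} over the sequence a."""
    terms = len(a) - n
    return sum(a[k] * a[k + n] for k in range(terms)) / terms


def line_code_psd(f, Tb, R, P):
    """Eq. (7.11c): S_y(f) = |P(f)|^2 / Tb * [R_0 + 2 sum_{n>=1} R_n cos(2 pi n f Tb)]."""
    bracket = R[0] + 2.0 * sum(
        R[n] * math.cos(2 * math.pi * n * f * Tb) for n in range(1, len(R))
    )
    return abs(P(f)) ** 2 / Tb * bracket


def half_width_rect_spectrum(Tb):
    """P(f) of a rectangular pulse of width Tb/2 (pulse shape assumed for this sketch)."""
    def P(f):
        x = math.pi * f * Tb / 2
        return Tb / 2 if x == 0 else (Tb / 2) * math.sin(x) / x
    return P


random.seed(1)                      # fixed seed so the run is repeatable
Tb = 1.0
polar = [random.choice((-1, 1)) for _ in range(100_000)]
R = [symbol_autocorrelation(polar, n) for n in range(4)]
print("estimated R_0..R_3:", [round(r, 3) for r in R])
print("S_y(0) =", line_code_psd(0.0, Tb, R, half_width_rect_spectrum(Tb)))
```

Because the symbol statistics enter only through the list `R`, the same `line_code_psd` function can be reused for any line code once its $R_n$ values are known or estimated.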


7.2.2 Polar Signaling

In polar signaling, 1 is transmitted by a pulse $p(t)$ and 0 is represented by $-p(t)$. In this case,
$a_k$ is equally likely to be 1 or $-1$, and $a_k^2$ is always 1. Hence,

$$R_0 = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k^2$$

There are $N$ pulses and $a_k^2 = 1$ for each one, and the summation on the right-hand side above
is $N$. Hence,

$$R_0 = \lim_{N\to\infty}\frac{1}{N}(N) = 1 \tag{7.12a}$$

Moreover, both $a_k$ and $a_{k+1}$ are either 1 or $-1$. Hence, $a_k a_{k+1}$ is either 1 or $-1$. Because the
pulse amplitude $a_k$ is equally likely to be 1 and $-1$ on the average, out of $N$ terms the product
$a_k a_{k+1}$ is equal to 1 for $N/2$ terms and is equal to $-1$ for the remaining $N/2$ terms. Therefore,


Possible Values of $a_k a_{k+1}$

                  a_k = -1    a_k = +1
  a_{k+1} = -1        1          -1
  a_{k+1} = +1       -1           1

$$R_1 = \lim_{N\to\infty}\frac{1}{N}\left[\frac{N}{2}(1) + \frac{N}{2}(-1)\right] = 0 \tag{7.12b}$$


Arguing this way, we see that the product $a_k a_{k+n}$ is also equally likely to be 1 or $-1$. Hence,

$$R_n = 0 \qquad n \ge 1 \tag{7.12c}$$

Therefore from Eq. (7.11c)

$$S_y(f) = \frac{|P(f)|^2}{T_b} R_0 = \frac{|P(f)|^2}{T_b} \tag{7.13}$$


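For the sake of comparison of various schemes, we shall consider a specific pulse shape.
Let $p(t)$ be a rectangular pulse of width $T_b/2$ (half-width rectangular pulse), that is,

$$p(t) = \Pi\left(\frac{t}{T_b/2}\right) = \Pi\left(\frac{2t}{T_b}\right)$$

and

$$P(f) = \frac{T_b}{2}\,\mathrm{sinc}\left(\frac{\pi fT_b}{2}\right) \tag{7.14}$$

Therefore

$$S_y(f) = \frac{T_b}{4}\,\mathrm{sinc}^2\left(\frac{\pi fT_b}{2}\right) \tag{7.15}$$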


Figure 7.6 shows the spectrum $S_y(f)$. It is clear that the polar signal has most of its power
concentrated in lower frequencies. Theoretically, the spectrum becomes very small as frequency
increases but never becomes totally zero above a certain frequency. To define a meaningful
measure of bandwidth, we consider its first non-dc null frequency to be its essential bandwidth.*
From the polar signal spectrum, the essential bandwidth of the signal is seen to be $2R_b$ Hz
(where $R_b$ is the clock frequency). This is 4 times the theoretical bandwidth (Nyquist
bandwidth) required to transmit $R_b$ pulses per second. Increasing the pulse width reduces the
bandwidth (expansion in the time domain results in compression in the frequency domain).


* Strictly speaking, the location of the first null frequency above dc is not always a good measure of signal
bandwidth. Whether the first non-dc null is a meaningful bandwidth depends on the amount of signal power
contained in the main (or first) lobe of the PSD, as we will see later in the PSD comparison of several line codes
(Fig. 7.9). In most practical cases, this approximation is acceptable for commonly used line codes and pulse shapes.

Figure 7.6 Power spectral density of a polar signal.

For a full-width pulse† (maximum possible pulse width), the essential bandwidth is half, that
is, $R_b$ Hz. This is still twice the theoretical bandwidth. Thus, polar signaling is not the most
bandwidth efficient.
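The bandwidth figures quoted above can be checked with a short sketch that scans the polar PSD of Eq. (7.13) for its first non-dc null. It assumes a rectangular pulse of width $W$, for which $|P(f)| = W\,|\mathrm{sinc}(\pi fW)|$ with the convention $\mathrm{sinc}(x) = \sin x / x$; the helper names, the grid size, and the tolerance are illustrative choices.

```python
import math


def sinc(x):
    """Convention assumed here: sinc(x) = sin(x) / x."""
    return 1.0 if x == 0 else math.sin(x) / x


def polar_psd(f, Tb, width_fraction):
    """Eq. (7.13) for a rectangular pulse of width W = width_fraction * Tb."""
    W = width_fraction * Tb
    return (W * sinc(math.pi * f * W)) ** 2 / Tb


def first_null_frequency(psd, f_max, steps=20_000, tol=1e-12):
    """Return the first grid frequency f > 0 at which the PSD is numerically zero."""
    for i in range(1, steps + 1):
        f = f_max * i / steps
        if psd(f) < tol:
            return f
    return None


Tb = 1.0
Rb = 1.0 / Tb
for label, fraction in (("half-width (RZ)", 0.5), ("full-width (NRZ)", 1.0)):
    null = first_null_frequency(lambda f: polar_psd(f, Tb, fraction), f_max=4.0 * Rb)
    print(f"{label}: first null at {null / Rb:.1f} x Rb")
```

The grid is chosen so that the multiples of $R_b$ fall exactly on grid points; a coarser or misaligned grid would need a root finder instead of a threshold test.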

Second, polar signaling has no capability for error detection or error correction. A third
disadvantage of polar signaling is that it has nonzero PSD at dc ($f = 0$). This will rule out the use
of ac coupling during transmission. The ac mode of coupling, which permits transformers and
blocking capacitors to aid in impedance matching and bias removal, and allows dc powering
of the line repeaters over the cable pairs, is very important in practice. Later, we shall show
how a PSD of a line code may be forced to zero at dc by properly shaping $p(t)$.

On the positive side, polar signaling is the most efficient scheme from the power requirement
viewpoint. For a given power, it can be shown that the detection error probability for a
polar scheme is the lowest among all signaling techniques (see Chapter 10). Polar signaling is
also transparent because there is always some pulse (positive or negative) regardless of the bit
sequence. There is no discrete clock frequency component in the spectrum of the polar signal.
Rectification of the RZ polar signal, however, yields a periodic signal of clock frequency and
can readily be used to extract timing.


7.2.3 Constructing a DC Null in PSD by Pulse Shaping 

Because $S_y(f)$, the PSD of a line code, contains a factor $|P(f)|^2$, we can force the PSD to have
a dc null by selecting a pulse $p(t)$ such that $P(f)$ is zero at dc ($f = 0$). Because

$$P(f) = \int_{-\infty}^{\infty} p(t)e^{-j2\pi ft}\,dt$$

we have

$$P(0) = \int_{-\infty}^{\infty} p(t)\,dt$$

† Scheme using the full-width pulse $p(t) = \Pi(t/T_b)$ is an example of a non-return-to-zero (NRZ) scheme. The
half-width pulse scheme, on the other hand, is an example of a return-to-zero (RZ) scheme.

Figure 7.7

Hence, if the area under $p(t)$ is made zero, $P(0)$ is zero, and we have a dc null in the PSD. For
a rectangular pulse, one possible shape of $p(t)$ to accomplish this is shown in Fig. 7.7a. When
we use this pulse with polar line coding, the resulting signal is known as Manchester code, or
split-phase (also called twinned-binary), signal. The reader can use Eq. (7.13) to show that
for this pulse, the PSD of the Manchester line code has a dc null (see Prob. 7.2-2).
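The dc null can also be confirmed numerically. The sketch below assumes a split-phase pulse that is $+1$ over the first half of the bit interval and $-1$ over the second half (an assumed stand-in for the shape in Fig. 7.7a), approximates $P(f)$ with a midpoint-rule integral, and evaluates the polar PSD of Eq. (7.13). The function names and the number of integration samples are illustrative.

```python
import cmath
import math


def manchester_pulse(t, Tb):
    """Assumed split-phase pulse: +1 on [-Tb/2, 0), -1 on [0, Tb/2), 0 elsewhere."""
    if -Tb / 2 <= t < 0:
        return 1.0
    if 0 <= t < Tb / 2:
        return -1.0
    return 0.0


def fourier_transform(p, f, Tb, samples=4000):
    """Midpoint-rule approximation of P(f) = integral of p(t) exp(-j 2 pi f t) dt."""
    dt = Tb / samples
    total = 0j
    for i in range(samples):
        t = -Tb / 2 + (i + 0.5) * dt
        total += p(t, Tb) * cmath.exp(-2j * math.pi * f * t) * dt
    return total


def polar_psd(p, f, Tb):
    """Eq. (7.13): S_y(f) = |P(f)|^2 / Tb for polar signaling."""
    return abs(fourier_transform(p, f, Tb)) ** 2 / Tb


Tb = 1.0
print("area under p(t):", abs(fourier_transform(manchester_pulse, 0.0, Tb)))
print("S_y(0)      =", polar_psd(manchester_pulse, 0.0, Tb))
print("S_y(1/Tb)   =", polar_psd(manchester_pulse, 1.0 / Tb, Tb))
```

The zero at $f = 0$ comes purely from the zero area of the pulse, so the same check works for any other zero-area pulse shape passed in place of `manchester_pulse`.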


7.2.4 On-Off Signaling 

In on-off signaling, a 1 is transmitted by a pulse $p(t)$ and a 0 is transmitted by no pulse. Hence,
a pulse strength $a_k$ is equally likely to be 1 or 0. Out of $N$ pulses in the interval of $T$ seconds,
$a_k$ is 1 for $N/2$ pulses and is 0 for the remaining $N/2$ pulses on the average. Hence,

$$R_0 = \lim_{N\to\infty}\frac{1}{N}\left[\frac{N}{2}(1) + \frac{N}{2}(0)\right] = \frac{1}{2} \tag{7.16}$$

To compute $R_n$ we need to consider the product $a_k a_{k+n}$. Since $a_k$ and $a_{k+n}$ are equally likely
to be 1 or 0, the product $a_k a_{k+n}$ is equally likely to be $1 \times 1$, $1 \times 0$, $0 \times 1$, or $0 \times 0$, that is,
1, 0, 0, 0. Therefore on the average, the product $a_k a_{k+n}$ is equal to 1 for $N/4$ terms and 0 for
$3N/4$ terms and

$$R_n = \lim_{N\to\infty}\frac{1}{N}\left[\frac{N}{4}(1) + \frac{3N}{4}(0)\right] = \frac{1}{4} \qquad n \ge 1 \tag{7.17}$$

Therefore [Eq. (7.9)],

$$S_x(f) = \frac{1}{2T_b} + \frac{1}{4T_b}\sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} e^{-jn2\pi fT_b} \tag{7.18a}$$

$$= \frac{1}{4T_b} + \frac{1}{4T_b}\sum_{n=-\infty}^{\infty} e^{-jn2\pi fT_b} \tag{7.18b}$$

Equation (7.18b) is obtained from Eq. (7.18a) by splitting the term $1/2T_b$ corresponding to
$R_0$ into two: $1/4T_b$ outside the summation and $1/4T_b$ inside the summation (corresponding to
$n = 0$). We now use the formula (see the footnote for a proof†)


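$$\sum_{n=-\infty}^{\infty} e^{-jn2\pi fT_b} = \frac{1}{T_b}\sum_{n=-\infty}^{\infty}\delta\left(f - \frac{n}{T_b}\right)$$

Substitution of this result in Eq. (7.18b) yields

$$S_x(f) = \frac{1}{4T_b} + \frac{1}{4T_b^2}\sum_{n=-\infty}^{\infty}\delta\left(f - \frac{n}{T_b}\right) \tag{7.19a}$$

and the desired PSD of the on-off waveform $y(t)$ is [from Eq. (7.11a)]

$$S_y(f) = \frac{|P(f)|^2}{4T_b}\left[1 + \frac{1}{T_b}\sum_{n=-\infty}^{\infty}\delta\left(f - \frac{n}{T_b}\right)\right] \tag{7.19b}$$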


Note that unlike the continuous PSD spectrum of polar signaling, the on-off PSD of Eq. (7.19b)
also has an additional discrete part. This discrete part may be nullified if the pulse shape is
chosen such that

$$P\left(\frac{n}{T_b}\right) = 0 \qquad n = 0, \pm 1, \ldots$$

For the example case of a half-width rectangular pulse [see Eq. (7.14)],

$$S_y(f) = \frac{T_b}{16}\,\mathrm{sinc}^2\left(\frac{\pi fT_b}{2}\right)\left[1 + \frac{1}{T_b}\sum_{n=-\infty}^{\infty}\delta\left(f - \frac{n}{T_b}\right)\right]$$

The resulting PSD is shown in Fig. 7.8. The continuous component of the spectrum is
$(T_b/16)\,\mathrm{sinc}^2(\pi fT_b/2)$. This is identical (except for a scaling factor) to the spectrum of the
polar signal [Eq. (7.15)]. The discrete component is represented by the product of an impulse
train with the continuous component $(T_b/16)\,\mathrm{sinc}^2(\pi fT_b/2)$. Hence this component appears
as periodic impulses with the continuous component as the envelope. Moreover, the impulses
repeat at the clock frequency $R_b = 1/T_b$ because its fundamental frequency is $2\pi/T_b$ rad/s, or
$1/T_b$ Hz. This is a logical result because as Fig. 7.3 shows, an on-off signal can be expressed
as a sum of a polar and a periodic component. The polar component $y_1(t)$ is exactly half


† The impulse train in Fig. 3.23a of Example 3.11 is $\delta_{T_b}(t) = \sum_{n=-\infty}^{\infty}\delta(t - nT_b)$. Moreover, the Fourier series for
this impulse train as found in Eq. (2.67) is

$$\sum_{n=-\infty}^{\infty}\delta(t - nT_b) = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} e^{jn2\pi R_b t} \qquad R_b = \frac{1}{T_b}$$

We take the Fourier transform of both sides of this equation, and use the fact that $\delta(t - nT_b) \Longleftrightarrow e^{-jn2\pi fT_b}$ and
$e^{jn2\pi R_b t} \Longleftrightarrow \delta(f - nR_b)$. This yields

$$\sum_{n=-\infty}^{\infty} e^{-jn2\pi fT_b} = \frac{1}{T_b}\sum_{n=-\infty}^{\infty}\delta\left(f - \frac{n}{T_b}\right)$$

Figure 7.8 Power spectral density (PSD) of an on-off signal.

the polar signal discussed earlier. Hence, the PSD of this component is one-fourth the PSD in
Eq. (7.15). The periodic component is of clock frequency $R_b$; it consists of discrete components
of frequency $R_b$ and its harmonics.

On-off signaling has very little to brag about. For a given transmitted power, it is less
immune to noise interference than the polar scheme, which uses a positive pulse for 1 and a
negative pulse for 0. This is because the noise immunity depends on the difference of amplitudes
representing 1 and 0. Hence, for the same immunity, if on-off signaling uses pulses of
amplitudes 2 and 0, polar signaling need use only pulses of amplitudes 1 and $-1$. It is simple
to show that on-off signaling requires twice as much power as polar signaling. If a pulse of
amplitude 1 or $-1$ has energy $E$, then the pulse of amplitude 2 has energy $(2)^2E = 4E$. Because
$1/T_b$ digits are transmitted per second, polar signal power is $(E)(1/T_b) = E/T_b$. For the on-off
case, on the other hand, each pulse energy is $4E$, though on average such a pulse is transmitted
over half of the time while nothing is transmitted over the other half. Hence, the average signal
power of on-off is

$$\frac{1}{T_b}\left[4E\cdot\frac{1}{2} + 0\cdot\frac{1}{2}\right] = \frac{2E}{T_b}$$

which is twice that required for the polar signal. Moreover, unlike the polar case, on-off
signaling is not transparent. A long string of 0s (or offs) causes the absence of a signal and
can lead to errors in timing extraction. In addition, all the disadvantages of polar signaling
(e.g., excessive transmission bandwidth, nonzero power spectrum at dc, no error detection (or
correction) capability) are also present in on-off signaling.
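The factor-of-two power penalty follows from simple arithmetic, sketched below. The helper name `average_power` and the normalized values $E = 1$, $T_b = 1$ are assumptions for illustration; the amplitudes are the ones used in the argument above (polar $\pm 1$, on-off 2 and 0, so both have the same amplitude separation).

```python
import math


def average_power(amplitudes, probabilities, unit_energy, Tb):
    """Average power = sum_i prob_i * (amplitude_i^2 * E) / Tb,
    where E is the energy of a unit-amplitude pulse."""
    mean_energy = sum(pr * (a ** 2) * unit_energy
                      for a, pr in zip(amplitudes, probabilities))
    return mean_energy / Tb


E, Tb = 1.0, 1.0                                         # normalized values
polar = average_power((+1, -1), (0.5, 0.5), E, Tb)       # E / Tb
on_off = average_power((2, 0), (0.5, 0.5), E, Tb)        # 2E / Tb
ratio = on_off / polar
print("on-off / polar power ratio:", ratio)              # prints 2.0
print("penalty in dB:", round(10 * math.log10(ratio), 2))
```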


7.2.5 Bipolar Signaling 

The signaling scheme used in PCM for telephone networks is called bipolar (pseudoternary
or alternate mark inverted). A 0 is transmitted by no pulse, and a 1 is transmitted by a pulse
$p(t)$ or $-p(t)$, depending on whether the previous 1 was transmitted by $-p(t)$ or $p(t)$. With
consecutive pulses alternating, we can avoid dc wander and thus cause a dc null in the PSD.
Bipolar signaling actually uses three symbols [$p(t)$, 0, and $-p(t)$], and, hence, it is in reality
ternary rather than binary signaling.

To calculate the PSD, we have

$$R_0 = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k^2$$

On the average, half of the $a_k$'s are 0, and the remaining half are either 1 or $-1$, with $a_k^2 = 1$.
Therefore,

$$R_0 = \lim_{N\to\infty}\frac{1}{N}\left[\frac{N}{2}(\pm 1)^2 + \frac{N}{2}(0)\right] = \frac{1}{2}$$

To compute $R_1$, we consider the pulse strength product $a_k a_{k+1}$. There are four equally
likely sequences of two bits: 11, 10, 01, 00. Since bit 0 is encoded by no pulse ($a_k = 0$),
the product $a_k a_{k+1}$ is zero for the last three of these sequences. This means, on the average,
that $3N/4$ combinations have $a_k a_{k+1} = 0$ and only $N/4$ combinations have nonzero $a_k a_{k+1}$.
Because of the bipolar rule, the bit sequence 11 can be encoded only by two consecutive
pulses of opposite polarities. This means the product $a_k a_{k+1} = -1$ for the $N/4$ combinations.
Therefore

$$R_1 = \lim_{N\to\infty}\frac{1}{N}\left[\frac{N}{4}(-1) + \frac{3N}{4}(0)\right] = -\frac{1}{4}$$

To compute $R_2$ in a similar way, we need to observe the product $a_k a_{k+2}$. For this, we need
to consider all possible combinations of three bits in sequence. There are eight equally likely
combinations: 111, 101, 110, 100, 011, 010, 001, 000. The last six combinations have either
the first and/or the last bit 0. Hence $a_k a_{k+2} = 0$ for all these six combinations. The first two
combinations are the only ones that yield nonzero $a_k a_{k+2}$. From the bipolar rule, the first
and the third pulses in the combination 111 are of the same polarity, yielding $a_k a_{k+2} = 1$.
But for 101, the first and the third pulse are of opposite polarity, yielding $a_k a_{k+2} = -1$.
Thus, on the average, $a_k a_{k+2} = 1$ for $N/8$ terms, $-1$ for $N/8$ terms, and 0 for $3N/4$ terms.
Hence,

$$R_2 = \lim_{N\to\infty}\frac{1}{N}\left[\frac{N}{8}(1) + \frac{N}{8}(-1) + \frac{3N}{4}(0)\right] = 0$$

In general

$$R_n = \lim_{N\to\infty}\frac{1}{N}\sum_k a_k a_{k+n}$$

For $n > 2$, the product $a_k a_{k+n}$ can be 1, $-1$, or 0. Moreover, an equal number of combinations
have values 1 and $-1$. This causes $R_n = 0$. Thus

$$R_n = 0 \qquad n > 1$$

and [see Eq. (7.11c)]

$$
\begin{aligned}
S_y(f) &= \frac{|P(f)|^2}{2T_b}\left[1 - \cos 2\pi fT_b\right] && (7.21\text{a})\\
&= \frac{|P(f)|^2}{T_b}\sin^2(\pi fT_b) && (7.21\text{b})
\end{aligned}
$$

Note that $S_y(f) = 0$ for $f = 0$ (dc), regardless of $P(f)$. Hence, the PSD has a dc null, which is
desirable for ac coupling. Moreover, $\sin^2(\pi fT_b) = 0$ at $f = 1/T_b$, that is, at $f = 1/T_b = R_b$
Hz. Thus, regardless of $P(f)$, we are assured of the first non-dc null bandwidth $R_b$ Hz. For the
half-width pulse

$$S_y(f) = \frac{T_b}{4}\,\mathrm{sinc}^2\left(\frac{\pi fT_b}{2}\right)\sin^2(\pi fT_b) \tag{7.22}$$

This is shown in Fig. 7.9. The essential bandwidth of the signal is $R_b$ ($R_b = 1/T_b$), which is
half that of polar using the same half-width pulse or on-off signaling and twice the theoretical
minimum bandwidth. Observe that we were able to obtain the bandwidth $R_b$ for polar (or
on-off) case for full-width pulse. For the bipolar case, the bandwidth is $R_b$ Hz whether the
pulse is half-width or full-width.

Bipolar signaling has several advantages: (1) its spectrum has a dc null; (2) its bandwidth is
not excessive; (3) it has single-error-detection capability. This is because even a single detection
error will cause a violation of the alternating pulse rule, and this will be immediately detected.
If a bipolar signal is rectified, we get an on-off signal that has a discrete component at the clock
frequency. Among the disadvantages of a bipolar signal is the requirement for twice as much
power (3 dB) as a polar signal needs. This is because bipolar detection is essentially equivalent
to on-off signaling from the detection point of view. One distinguishes between $+p(t)$ or $-p(t)$
from 0 rather than between $\pm p(t)$.
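The bipolar results can be illustrated with a small encoder. The sketch below applies the alternating-pulse rule to random bits, estimates $R_0$, $R_1$, and $R_2$ as time averages (expected to be close to $1/2$, $-1/4$, and 0), and shows how a single symbol error breaks the alternation. The function names, the seed, and the convention that the first 1 is sent as a positive pulse are assumptions of this example.

```python
import random


def ami_encode(bits):
    """Bipolar (AMI) rule: 0 -> no pulse, 1 -> pulse whose sign alternates with each 1."""
    symbols, last = [], -1          # assumed start: the first 1 becomes +1
    for b in bits:
        if b:
            last = -last
            symbols.append(last)
        else:
            symbols.append(0)
    return symbols


def symbol_autocorrelation(a, n):
    """Time-average estimate of R_n = mean of a_k * a_{k+n}."""
    terms = len(a) - n
    return sum(a[k] * a[k + n] for k in range(terms)) / terms


def violates_alternation(symbols):
    """True if two successive nonzero pulses have the same polarity."""
    last = 0
    for s in symbols:
        if s != 0:
            if s == last:
                return True
            last = s
    return False


random.seed(7)
bits = [random.randint(0, 1) for _ in range(200_000)]
a = ami_encode(bits)
R = [symbol_autocorrelation(a, n) for n in range(4)]
print("estimated R_0..R_3:", [round(r, 3) for r in R])

clean = ami_encode([1, 0, 1, 1, 0, 1])      # [1, 0, -1, 1, 0, -1]
corrupted = list(clean)
corrupted[3] = 0                            # one pulse lost in detection
print(violates_alternation(clean), violates_alternation(corrupted))
```

A lost pulse at the very start or end of a short block may leave no neighbor to compare against, so this simple check is only a sketch of the detection idea.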

Another disadvantage of bipolar signaling is that it is not transparent. In practice, various
substitution schemes are used to prevent long strings of logic zeros from allowing the extracted
clock signals to drift away. We shall now discuss two such schemes.

Figure 7.10 (a) HDB3 signal and (b) its PSD.

High-Density Bipolar (HDB) Signaling 

The HDB scheme is an ITU (formerly CCITT) standard. In this scheme the problem of
nontransparency in bipolar signaling is eliminated by adding pulses when the number of consecutive
0s exceeds N. Such a modified coding is designated as high-density bipolar coding (HDBN),
where N can take on any value 1, 2, 3, .... The most important of the HDB codes is the HDB3
format, which has been adopted as an international standard.

The basic idea of the HDBN code is that when a run of N + 1 zeros occurs, this group of
zeros is replaced by one of the special N + 1 binary digit sequences. To increase the timing
content of the signal, the sequences are chosen to include some binary 1s. The 1s included
deliberately violate the bipolar rule for easy identification of the substituted sequence. In HDB3
coding, for example, the special sequences used are 000V and B00V, where B = 1 that conforms
to the bipolar rule and V = 1 that violates the bipolar rule. The choice of sequence 000V or
B00V is made in such a way that consecutive V pulses alternate signs to avoid dc wander
and to maintain the dc null in the PSD. This requires that the sequence B00V be used when
there are an even number of 1s following the last special sequence and the sequence 000V be
used when there are an odd number of 1s following the last sequence. Figure 7.10a shows an
example of this coding. Note that in the sequence B00V, both B and V are encoded by the
same pulse. The decoder has to check two things: the bipolar violations and the number of 0s
preceding each violation to determine if the previous 1 is also a substitution.
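The substitution rule just described can be written as a short encoder. The following is a minimal sketch, not a standards-compliant implementation: the function name `hdb3_encode` and the initial state (last pulse taken as negative, zero 1s counted before the first substitution) are assumptions made for this example.

```python
def hdb3_encode(bits):
    """Encode bits into HDB3 symbols (+1, 0, -1) using the 000V / B00V rule.

    B00V is used when an even number of 1s has followed the last special
    sequence, 000V when that number is odd. V repeats the polarity of the
    preceding pulse (a violation); B obeys the alternation rule.
    """
    bits = list(bits)
    out = []
    last_pulse = -1          # assumed polarity of the pulse before the data starts
    ones_since_sub = 0       # assumed count of 1s before the first substitution
    i = 0
    while i < len(bits):
        if bits[i] == 1:
            last_pulse = -last_pulse
            out.append(last_pulse)
            ones_since_sub += 1
            i += 1
        elif bits[i:i + 4] == [0, 0, 0, 0]:
            if ones_since_sub % 2 == 0:
                b = -last_pulse               # B conforms to the bipolar rule
                out += [b, 0, 0, b]           # V has the same polarity as B
                last_pulse = b
            else:
                out += [0, 0, 0, last_pulse]  # V repeats the previous pulse
            ones_since_sub = 0
            i += 4
        else:
            out.append(0)
            i += 1
    return out


print(hdb3_encode([1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]))
# prints [1, 0, 0, 0, 1, -1, 1, -1, 0, 0, -1]
```

In the printed example the first violation is positive and the second is negative, which is the alternation of V pulses that keeps the dc null.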

Despite deliberate bipolar violations, HDB signaling retains error detecting capability.
Any single error will insert a spurious bipolar violation (or will delete one of the deliberate
violations). This will become apparent when, at the next violation, the alternation of
violations does not appear. This also shows that deliberate violations can be detected despite single
errors. Figure 7.10b shows the PSD of HDB3 as well as that of a bipolar signal to facilitate
comparison.³

Binary with N Zero Substitution (BNZS) Signaling

A class of line codes similar to HDBN is the binary with N zero substitution, or BNZS
code, where if N zeros occur in succession, they are replaced by one of the two special
sequences containing some 1s to increase timing content. There are deliberate bipolar violations
just as in HDBN. Binary with eight-zero substitution (B8ZS) is used in DS1 signals of the
digital telephone hierarchy in Chapter 6. It replaces any string of eight consecutive zeros with a
sequence of ones and zeros containing two bipolar violations. Such a sequence is unlikely to be
counterfeited by errors, and any such sequence received by a digital channel bank is replaced
by a string of eight logic zeros prior to decoding. The sequence used as a replacement consists
of the pattern 000VB0VB. Similarly, in B6ZS code used in DS2 signals, a string of six zeros
is replaced with 0VB0VB, and DS3 signal features a three-zero B3ZS code. The B3ZS code
is slightly more complex than the others in that either B0V or 00V is used, the choice being
made so that the number of B pulses between consecutive V pulses is odd. These BNZS codes
with N = 3, 6, or 8 involve bipolar violations and must therefore be carefully replaced by their
equivalent zero strings at the receiver.
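The B8ZS pattern 000VB0VB can be sketched in the same style. Here V is a pulse with the same polarity as the pulse that precedes it and B is a pulse of opposite polarity; the function name `b8zs_encode` and the assumed initial polarity are illustrative only.

```python
def b8zs_encode(bits):
    """Bipolar encoding with each run of eight 0s replaced by 000VB0VB."""
    bits = list(bits)
    out = []
    last = -1                # assumed polarity of the pulse before the data starts
    i = 0
    while i < len(bits):
        if bits[i] == 1:
            last = -last
            out.append(last)
            i += 1
        elif bits[i:i + 8] == [0] * 8:
            v = last                              # first V repeats the previous pulse
            out += [0, 0, 0, v, -v, 0, -v, v]     # 0 0 0 V B 0 V B
            # The final B has polarity v, so `last` is unchanged.
            i += 8
        else:
            out.append(0)
            i += 1
    return out


print(b8zs_encode([1] + [0] * 8 + [1]))
# prints [1, 0, 0, 0, 1, -1, 0, -1, 1, -1]
```

The two violations have opposite polarities and the substituted block contains two positive and two negative pulses, so the block adds no net dc.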

There are many other transmission (line) codes, too numerous to list here. A list of codes 
and appropriate references can be found in Bylanski and Ingram.' 5 


7.3 PULSE SHAPING 

The PSD $S_y(f)$ of a digital signal $y(t)$ can be controlled by a choice of line code or by $P(f)$,
the pulse shape. In the last section we discussed how the PSD is controlled by a line code. In
this section we examine how $S_y(f)$ is influenced by the pulse shape $p(t)$, and we learn how to
shape a pulse $p(t)$ to achieve a desired $S_y(f)$. The PSD $S_y(f)$ is strongly and directly influenced
by the pulse shape $p(t)$ because $S_y(f)$ contains the term $|P(f)|^2$. Thus, in comparison to the
nature of the line code, the pulse shape is a more direct and potent factor in terms of shaping
the PSD $S_y(f)$.


7.3.1 Intersymbol Interferences (ISI) and Effect 

In the last section, we used a simple half-width rectangular pulse $p(t)$ for the sake of
illustration. Strictly speaking, in this case the bandwidth of $S_y(f)$ is infinite, since $P(f)$ has infinite
bandwidth. But we found that the essential bandwidth of $S_y(f)$ was finite. For example, most of
the power of a bipolar signal is contained within the essential band 0 to $R_b$ Hz. Note, however,
that the PSD is small but is still nonzero in the range $f > R_b$ Hz. Therefore, when such a
signal is transmitted over a channel of bandwidth $R_b$ Hz, a significant portion of its spectrum
is transmitted, but a small portion of the spectrum is suppressed. In Sec. 3.5 and Sec. 3.6, we
saw how such a spectral distortion tends to spread the pulse (dispersion). Spreading of a pulse
beyond its allotted time interval $T_b$ will cause it to interfere with neighboring pulses. This is
known as intersymbol interference or ISI.

ISI is not noise. ISI is caused by nonideal channels that are not distortionless over the 
entire signal bandwidth. In the case of half-width rectangular pulse, the signal bandwidth is, 
strictly speaking, infinity. ISI, as a manifestation of channel distortion, can cause errors in 
pulse detection if it is large enough. 

To resolve the difficulty of ISI, let us review briefly our problem. We need to transmit a
pulse every $T_b$ interval, the kth pulse being $a_k p(t - kT_b)$. The channel has a finite bandwidth, and
we are required to detect the pulse amplitude $a_k$ correctly (i.e., without ISI). In our discussion
so far, we have considered time-limited pulses. Since such pulses cannot be band-limited, part
of their spectra is suppressed by a band-limited channel. This causes pulse distortion (spreading
out) and, consequently, ISI. We can try to resolve this difficulty by using pulses that are
band-limited to begin with so that they can be transmitted intact over a band-limited channel. But
band-limited pulses cannot be time-limited. Obviously, various pulses will overlap and cause
ISI. Thus, whether we begin with time-limited pulses or band-limited pulses, it appears that ISI
cannot be avoided. It is inherent in the finite transmission bandwidth. Fortunately, there is an
escape from this blind alley. Pulse amplitudes can be detected correctly despite pulse spreading
(or overlapping), if there is no ISI at the decision-making instants. This can be accomplished
by a properly shaped band-limited pulse. To eliminate ISI, Nyquist proposed three different
criteria for pulse shaping,⁴ where the pulses are allowed to overlap. Yet, they are shaped to
cause zero (or controlled) interference with all the other pulses at the decision-making instants.
Thus, by limiting the noninterference requirement only at the decision-making instants, we
eliminate the need for the pulse to be totally nonoverlapping. We shall consider only the first
two criteria. The third is much less useful than the first two criteria,⁵ and hence, will not be
considered here.


7.3.2 Nyquist's First Criterion for Zero ISI 

In the first method, Nyquist achieves zero ISI by choosing a pulse shape that has a nonzero
amplitude at its center (say $t = 0$) and zero amplitudes at $t = \pm nT_b$ ($n = 1, 2, 3, \ldots$), where
$T_b$ is the separation between successive transmitted pulses (Fig. 7.11a). Thus,

$$p(t) = \begin{cases} 1 & t = 0 \\ 0 & t = \pm nT_b \end{cases} \qquad \left(T_b = \frac{1}{R_b}\right) \tag{7.23}$$

A pulse satisfying this criterion causes zero ISI at all the remaining pulse centers, or signaling
instants as shown in Fig. 7.11a, where we show several successive pulses (dashed) centered at
$t = 0, T_b, 2T_b, 3T_b, \ldots$ ($T_b = 1/R_b$). For the sake of convenience, we have shown all pulses
to be positive.† It is clear from this figure that the samples at $t = 0, T_b, 2T_b, 3T_b, \ldots$ consist
of the amplitude of only one pulse (centered at the sampling instant) with no interference from
the remaining pulses.

Now transmission of $R_b$ bit/s requires a theoretical minimum bandwidth $R_b/2$ Hz. It would
be nice if a pulse satisfying Nyquist's criterion had this minimum bandwidth $R_b/2$ Hz. Can we
find such a pulse $p(t)$? We have already solved this problem (Example 6.1 with $B = R_b/2$),
where we showed that there exists one (and only one) pulse which meets Nyquist's criterion
(7.23) and has a bandwidth $R_b/2$ Hz. This pulse, $p(t) = \mathrm{sinc}(\pi R_b t)$ (Fig. 7.11b), has the
property

$$\mathrm{sinc}(\pi R_b t) = \begin{cases} 1 & t = 0 \\ 0 & t = \pm nT_b \end{cases} \qquad \left(T_b = \frac{1}{R_b}\right) \tag{7.24a}$$

Moreover, the Fourier transform of this pulse is

$$P(f) = \frac{1}{R_b}\,\Pi\left(\frac{f}{R_b}\right) \tag{7.24b}$$

which has a bandwidth $R_b/2$ Hz as seen from Fig. 7.11c. We can use this pulse to transmit at
a rate of $R_b$ pulses per second without ISI, over a bandwidth of only $R_b/2$.
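The zero-ISI property of the sinc pulse is easy to verify numerically. The sketch below superposes sinc pulses weighted by a short amplitude sequence and samples the sum at the pulse centers; each sample should return the amplitude of exactly one pulse. The amplitude values, the function names, and the convention $\mathrm{sinc}(x) = \sin x / x$ are assumptions of this example.

```python
import math


def sinc(x):
    """Convention assumed here: sinc(x) = sin(x) / x."""
    return 1.0 if x == 0 else math.sin(x) / x


def received_sample(amplitudes, m, Tb, offset=0.0):
    """Sample y(t) = sum_k a_k sinc(pi Rb (t - k Tb)) at t = m Tb + offset."""
    Rb = 1.0 / Tb
    t = m * Tb + offset
    return sum(a * sinc(math.pi * Rb * (t - k * Tb))
               for k, a in enumerate(amplitudes))


a = [1, -1, -1, 1, 1, -1, 1]             # illustrative pulse amplitudes
samples = [received_sample(a, m, Tb=1.0) for m in range(len(a))]
print([round(s, 6) for s in samples])
```

Passing a nonzero `offset` shows the other side of the story: away from the exact sampling instants the neighboring pulses no longer vanish and every sample picks up interference.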

This scheme shows that we can attain the theoretical limit of performance by using a
sinc pulse. Unfortunately, this pulse is impractical because it starts at $-\infty$. We will have to
wait an infinite time to generate it. Any attempt to truncate it would increase its bandwidth
beyond $R_b/2$ Hz. But even if this pulse were realizable, it would have an undesirable feature:
namely, it decays too slowly at a rate $1/t$. This causes some serious practical problems. For
instance, if the nominal data rate of $R_b$ bit/s required for this scheme deviates a little, the pulse
amplitudes will not vanish at the other pulse centers. Because the pulses decay only as $1/t$,
the cumulative interference at any pulse center from all the remaining pulses is of the form
$\sum(1/n)$. It is well known that the infinite series of this form does not converge and can add up
to a very large value. A similar result occurs if everything is perfect at the transmitter but the
sampling rate at the receiver deviates from the rate of $R_b$ Hz. Again, the same thing happens if
the sampling instants deviate a little because of pulse time jitter, which is inevitable even in the
most sophisticated systems. This scheme therefore fails unless everything is perfect, which is
a practical impossibility. And all this is because $\mathrm{sinc}(\pi R_b t)$ decays too slowly (as $1/t$). The
solution is to find a pulse $p(t)$ that satisfies Eq. (7.23) but decays faster than $1/t$. Nyquist has
shown that such a pulse requires a bandwidth $kR_b/2$, with $1 \le k \le 2$.
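The divergence argument can be made concrete with a small experiment. The sketch below assumes a fixed sampling offset (a fraction of $T_b$) and adds up the magnitudes of the sinc-pulse tails from $K$ neighbors on each side, which is the interference when every neighbor happens to add with the worst sign. The function name and the 5% offset are illustrative assumptions.

```python
import math


def worst_case_isi(offset, K):
    """Sum of |sinc(pi (offset - k))| over k = -K..K, k != 0.

    `offset` is the sampling error as a fraction of Tb; the sum is the
    interference at one pulse center from K neighbors on each side when
    all contributions add with the same sign.
    """
    total = 0.0
    for k in range(-K, K + 1):
        if k != 0:
            x = math.pi * (offset - k)
            total += abs(math.sin(x) / x)
    return total


for K in (10, 100, 1000, 10000):
    print(K, round(worst_case_isi(0.05, K), 4))
```

Each tenfold increase in the number of neighbors adds roughly the same amount of interference, the logarithmic growth expected of a $\sum(1/n)$ series, whereas with zero offset the sum stays at numerical zero.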

This can be proved as follows. Let $p(t) \Longleftrightarrow P(f)$, where the bandwidth of $P(f)$ is in the
range $(R_b/2, R_b)$ (Fig. 7.12a). The desired pulse $p(t)$ satisfies Eq. (7.23). If we sample $p(t)$
every $T_b$ seconds by multiplying $p(t)$ by $\delta_{T_b}(t)$ (an impulse train), then because of the property
(7.23), all the samples, except the one at the origin, are zero. Thus, the sampled signal $\bar{p}(t)$ is

$$\bar{p}(t) = p(t)\,\delta_{T_b}(t) = \delta(t) \tag{7.25}$$

Following the analysis of Eq. (6.4) in Chapter 6, we know that the spectrum of a sampled signal
$\bar{p}(t)$ is ($1/T_b$ times) the spectrum of $p(t)$ repeating periodically at intervals of the sampling

† Actually, a pulse corresponding to 0 would be negative. But considering all positive pulses does not affect our
reasoning. Showing negative pulses would make the figure needlessly confusing.

Figure 7.12

frequency $R_b$. Therefore, the Fourier transform of both sides of Eq. (7.25) yields

$$\frac{1}{T_b}\sum_{n=-\infty}^{\infty} P(f - nR_b) = 1 \qquad \text{where } R_b = \frac{1}{T_b} \tag{7.26}$$

or

$$\sum_{n=-\infty}^{\infty} P(f - nR_b) = T_b \tag{7.27}$$

Thus, the sum of the spectra formed by repeating $P(f)$ spaced $R_b$ apart is a constant $T_b$, as
shown in Fig. 7.12b.*

Consider the spectrum in Fig. 7.12b over the range $0 < f < R_b$. Over this range only two
terms $P(f)$ and $P(f - R_b)$ in the summation in Eq. (7.27) are involved. Hence

$$P(f) + P(f - R_b) = T_b \qquad 0 < f < R_b$$

Letting $x = f - R_b/2$, we have

$$P(x + 0.5R_b) + P(x - 0.5R_b) = T_b \qquad |x| < 0.5R_b \tag{7.28a}$$

or, alternatively,

$$P\left(x + \frac{R_b}{2}\right) + P\left(x - \frac{R_b}{2}\right) = T_b \qquad |x| < 0.5R_b \tag{7.28b}$$

Use of the conjugate symmetry property [Eq. (3.11)] on Eq. (7.28) yields

$$P\left(\frac{R_b}{2} + x\right) + P^*\left(\frac{R_b}{2} - x\right) = T_b \qquad |x| < 0.5R_b \tag{7.29}$$

* Observe that if $R_b > 2B$, where $B$ is the bandwidth (in hertz) of $P(f)$, the repetitions of $P(f)$ are nonoverlapping,
and condition (7.27) cannot be satisfied. For $R_b = 2B$, the condition is satisfied only for the ideal low-pass
$P(f)$ [$p(t) = \mathrm{sinc}\,(\pi R_b t)$], which is not realizable. Hence, we must have $B > R_b/2$.

Figure 7.13 Vestigial (raised-cosine) spectrum.

If we choose $P(f)$ to be real-valued and positive, then only $|P(f)|$ needs to satisfy Eq. (7.29).
Because $|P(f)|$ is real, Eq. (7.29) implies

$$\left|P\left(\frac{R_b}{2} + x\right)\right| + \left|P\left(\frac{R_b}{2} - x\right)\right| = T_b \qquad |x| < 0.5R_b \tag{7.30}$$

Hence, $|P(f)|$ should be of the form shown in Fig. 7.13. This curve has an odd symmetry about
the set of axes intersecting at point $\alpha$ [the point on the $|P(f)|$ curve at $f = R_b/2$]. Note that this
requires that

$$|P(0.5R_b)| = 0.5\,|P(0)|$$

The bandwidth, in hertz, of $P(f)$ is $0.5R_b + f_x$, where $f_x$ is the bandwidth in excess of
the minimum bandwidth $R_b/2$. Let $r$ be the ratio of the excess bandwidth $f_x$ to the theoretical
minimum bandwidth $R_b/2$:

$$r = \frac{\text{excess bandwidth}}{\text{theoretical minimum bandwidth}} = \frac{f_x}{0.5R_b} = 2 f_x T_b \tag{7.31}$$

Observe that because $f_x$ cannot be larger than $R_b/2$,

$$0 \le r \le 1 \tag{7.32}$$

In terms of frequency $f$, the theoretical minimum bandwidth is $R_b/2$ Hz, and the excess
bandwidth is $f_x = rR_b/2$ Hz. Therefore, the bandwidth of $P(f)$ is

$$B_T = \frac{R_b}{2} + \frac{rR_b}{2} = \frac{(1 + r)R_b}{2} \tag{7.33}$$

Figure 7.14 Pulses satisfying Nyquist's first criterion: solid curve, ideal $f_x = 0$ ($r = 0$); light dashed
curve, $f_x = R_b/4$ ($r = 0.5$); heavy dashed curve, $f_x = R_b/2$ ($r = 1$).

The constant r is called the roll-off factor and is also expressed in terms of percent. For 
example, if P(f) is a Nyquist first criterion spectrum with a bandwidth that is 50% higher than 
the theoretical minimum, its roll-off factor r = 0.5 or 50%. 

A filter having an amplitude response with the same characteristics is required in the
vestigial sideband modulation discussed in Sec. 4.5 [Eq. (4.26)]. For this reason, we shall
refer to the spectrum $P(f)$ in Eqs. (7.29) and (7.30) as a vestigial spectrum. The pulse
$p(t)$ in Eq. (7.23) has zero ISI at the centers of all other pulses transmitted at a rate of
$R_b$ pulses per second. A pulse $p(t)$ that causes zero ISI at the centers of all the remaining
pulses (or signaling instants) is the Nyquist first criterion pulse. We have shown that a pulse
with a vestigial spectrum [Eq. (7.29) or Eq. (7.30)] satisfies Nyquist's first criterion for
zero ISI.

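Because $0 \le r \le 1$, the bandwidth of $P(f)$ is restricted to the range $R_b/2$ to $R_b$ Hz. The
pulse $p(t)$ can be generated as a unit impulse response of a filter with transfer function $P(f)$.
But because $P(f) = 0$ over a frequency band, it violates the Paley-Wiener criterion and is
therefore unrealizable. However, the vestigial roll-off characteristic is gradual, and it can be
more closely approximated by a practical filter. One family of spectra that satisfies Nyquist's
first criterion is

$$P(f) = \begin{cases} 1 & |f| < \dfrac{R_b}{2} - f_x \\[4pt] \dfrac{1}{2}\left[1 - \sin \dfrac{\pi\left(|f| - R_b/2\right)}{2 f_x}\right] & \left|\,|f| - \dfrac{R_b}{2}\right| < f_x \\[4pt] 0 & |f| > \dfrac{R_b}{2} + f_x \end{cases} \tag{7.34}$$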


Figure 7.14a shows three curves from this family, corresponding to $f_x = 0$ ($r = 0$),
$f_x = R_b/4$ ($r = 0.5$), and $f_x = R_b/2$ ($r = 1$). The respective impulse responses are shown in
Fig. 7.14b. It can be seen that increasing $f_x$ (or $r$) improves $p(t)$; that is, more gradual cutoff
reduces the oscillatory nature of $p(t)$ and causes it to decay more rapidly in the time domain. For
the case of the maximum value of $f_x = R_b/2$ ($r = 1$), Eq. (7.34) reduces to

$$P(f) = \frac{1}{2}\left(1 + \cos \pi f T_b\right) \Pi\left(\frac{f}{2R_b}\right) \tag{7.35a}$$

$$\phantom{P(f)} = \cos^2\left(\frac{\pi f}{2R_b}\right) \Pi\left(\frac{f}{2R_b}\right) \tag{7.35b}$$

This characteristic of Eq. (7.34) is known in the literature as the raised-cosine characteristic,
because it represents a cosine raised by its peak amplitude. Eq. (7.35) is also known as the
full-cosine roll-off characteristic. The inverse Fourier transform of this spectrum is readily
found as (see Prob. 7.3-8)

$$p(t) = R_b\, \frac{\cos \pi R_b t}{1 - 4R_b^2 t^2}\, \mathrm{sinc}\,(\pi R_b t) \tag{7.36}$$

This pulse is shown in Fig. 7.14b ($r = 1$). We can make several important observations about
the raised-cosine pulse. First, the bandwidth of this pulse is $R_b$ Hz, and the pulse has a value $R_b$ at $t = 0$
and is zero not only at all the remaining signaling instants but also at points midway between
all the signaling instants. Second, it decays rapidly, as $1/t^3$. As a result, the raised-cosine pulse
is relatively insensitive to deviations of $R_b$, sampling rate, timing jitter, and so on. Furthermore,
the pulse-generating filter with transfer function $P(f)$ [Eq. (7.35b)] is closely realizable. The
phase characteristic that goes along with this filter is very nearly linear, so that no additional
phase equalization is needed.
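The observations above can be checked numerically. The following is a minimal Python sketch, not taken from the text, that evaluates the pulse of Eq. (7.36) with sinc(x) = sin(x)/x. The function names, the normalization R_b = 1, and the explicit handling of the removable singularities at t = 0 and t = ±T_b/2 are assumptions made for illustration.

```python
import math

def raised_cosine_pulse(t, rb):
    """Full-cosine roll-off (r = 1) pulse of Eq. (7.36), with sinc(x) = sin(x)/x."""
    u = rb * t                      # time measured in bit intervals T_b = 1/R_b
    if abs(u) < 1e-12:              # limit at t = 0
        return rb
    if abs(abs(u) - 0.5) < 1e-12:   # removable singularity at t = +/- T_b/2
        return rb / 2
    sinc = math.sin(math.pi * u) / (math.pi * u)
    return rb * math.cos(math.pi * u) / (1 - 4 * u * u) * sinc

def sinc_pulse(t, rb):
    """Minimum-bandwidth pulse sinc(pi R_b t), which decays only as 1/t."""
    u = rb * t
    return 1.0 if abs(u) < 1e-12 else math.sin(math.pi * u) / (math.pi * u)

RB = 1.0  # assumed pulse rate (pulses per second), so T_b = 1 s

# Samples at the other signaling instants t = n T_b, n = 1..5
signaling = [raised_cosine_pulse(n / RB, RB) for n in range(1, 6)]
# Samples midway between signaling instants, t = (n + 0.5) T_b, n = 1..5
midpoints = [raised_cosine_pulse((n + 0.5) / RB, RB) for n in range(1, 6)]
# Compare the tails of the two pulses well away from the origin
tail_rc = abs(raised_cosine_pulse(10.25 / RB, RB))
tail_sinc = abs(sinc_pulse(10.25 / RB, RB))
```

In this sketch the midpoints nearest the origin, t = ±T_b/2, are the exception: there the pulse takes half its peak value rather than zero, which is why the midpoint list starts at 1.5 T_b.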

It should be remembered that it is the pulses received at the detector input that should
have the form for zero ISI. In practice, because the channel is not ideal (distortionless), the
transmitted pulses should be shaped so that after passing through the channel with transfer
function $H_c(f)$, they will be received with the proper shape (such as raised-cosine pulses) at
the receiver. Hence, the transmitted pulse $p_i(t)$ should satisfy

$$P_i(f)\,H_c(f) = P(f)$$

where $P(f)$ has the vestigial spectrum in Eq. (7.30). For convenience, the transfer function
$H_c(f)$ as a channel may also include a receiver filter designed to reject interference and other
out-of-band noises.


Example 7.1 Determine the pulse transmission rate in terms of the transmission bandwidth $B_T$ and the
roll-off factor $r$. Assume a scheme using Nyquist's first criterion.

From Eq. (7.33)

$$R_b = \frac{2}{1 + r}\, B_T$$

Because $0 \le r \le 1$, the pulse transmission rate varies from $2B_T$ to $B_T$, depending on the
choice of $r$. A smaller $r$ gives a higher signaling rate. But the pulse $p(t)$ decays slowly,
creating the same problems as those discussed for the sinc pulse. For the raised-cosine
pulse, $r = 1$ and $R_b = B_T$; we achieve half the theoretical maximum rate. But the pulse
decays faster, as $1/t^3$, and is less vulnerable to ISI.

7.3.3 Controlled ISI or Partial Response Signaling

The Nyquist criterion pulse requires a bandwidth somewhat larger than the theoretical
minimum. If we wish to further reduce the pulse bandwidth, we must find a way to widen the
pulse $p(t)$ (the wider the pulse, the narrower the bandwidth). Widening the pulse may result in
interference (ISI) with the neighboring pulses. However, in binary transmission with just
two possible symbols, a known and controlled amount of ISI can be removed or
compensated because there are only a few possible interference patterns.

Consider a pulse specified by (see Fig. 7.15):

$$p(nT_b) = \begin{cases} 1 & n = 0, 1 \\ 0 & \text{for all other } n \end{cases} \tag{7.37}$$


This leads to a known and controlled ISI from the $k$th pulse to the very next transmitted
pulse. We use polar signaling by means of this pulse. Thus, 1 is transmitted by $p(t)$ and 0 is
transmitted by using the pulse $-p(t)$. The received signal is sampled at $t = nT_b$, and the pulse
$p(t)$ has zero value at all $n$ except for $n = 0$ and 1, where its value is 1 (Fig. 7.15). Clearly,
such a pulse causes zero ISI with all the pulses except the succeeding pulse. Therefore, we
need to worry about the ISI with the succeeding pulse only. Consider two such successive
pulses located at 0 and $T_b$, respectively. If both pulses were positive, the sample value of the
resulting signal at $t = T_b$ would be 2. If both pulses were negative, the sample value would
be $-2$. But if the two pulses were of opposite polarity, the sample value would be 0. With
only these three possible values, the signal sample clearly allows us to make the correct decision
at the sampling instants. The decision rule is as follows. If the sample value is positive, the
present bit is 1 and the previous bit is also 1. If the sample value is negative, the present
bit is 0 and the previous bit is also 0. If the sample value is zero, the present bit is the
opposite of the previous bit. Knowledge of the previous bit then allows the determination of the
present bit.

Table 7.1 shows a transmitted bit sequence, the sample values of the received signal $y(t)$
(assuming no errors caused by channel noise), and the detector decision. This example also
indicates the error-detecting property of this scheme. Examination of samples of the waveform
$y(t)$ in Table 7.1 shows that there are always an even number of zero-valued samples between
two full-valued samples of the same polarity and an odd number of zero-valued samples
between two full-valued samples of opposite polarity. Thus, ignoring the initial sample of 1
(which has no preceding pulse), the first sample value of $y(t)$ is 2,
and the next full-valued sample (the fourth sample) is 2. Between these full-valued samples
of the same polarity, there are an even number (i.e., 2) of zero-valued samples. If one of the
sample values is detected wrong, this rule is violated, and the error is detected.

TABLE 7.1
Transmitted Bits and the Received Samples in Controlled ISI Signaling

Information sequence     1   1   0   1   1   0   0   0   1   0   1   1   1
Samples y(kT_b)          1   2   0   0   2   0  -2  -2   0   0   0   2   2
Detected sequence        1   1   0   1   1   0   0   0   1   0   1   1   1
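The sampling and decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch rather than anything from the text: the function names are hypothetical, the channel is assumed noise-free, and the first sample is taken to be the first polar symbol alone because it has no preceding pulse.

```python
def duobinary_samples(bits):
    """Noise-free samples y(kT_b) = a_k + a_{k-1} for polar symbols a_k = 2*bit - 1."""
    symbols = [2 * b - 1 for b in bits]
    samples = [symbols[0]]                       # the first pulse has no predecessor
    samples += [cur + prev for prev, cur in zip(symbols, symbols[1:])]
    return samples

def detect_digit_by_digit(samples):
    """Apply the decision rule; a zero sample means 'opposite of the previous bit'."""
    bits = [1 if samples[0] > 0 else 0]
    for y in samples[1:]:
        if y > 0:
            bits.append(1)
        elif y < 0:
            bits.append(0)
        else:
            bits.append(1 - bits[-1])            # relies on the previous decision
    return bits

info = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1]   # information sequence of Table 7.1
y = duobinary_samples(info)
detected = detect_digit_by_digit(y)
```

Running the sketch on the information sequence of Table 7.1 reproduces the table's sample row and recovers the transmitted bits.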

The pulse $p(t)$ goes to zero at $t = -T_b$ and $2T_b$, resulting in a pulse width (of the
primary lobe) 50% higher than that of the first criterion pulse. This pulse broadening in the
time domain leads to reduction of its bandwidth. This is the second criterion proposed by
Nyquist. This scheme of controlled ISI is also known as the correlative or partial-response
scheme. A pulse satisfying the second criterion in Eq. (7.37) is also known as the duobinary
pulse.


7.3.4 Example of a Duobinary Pulse 

If we restrict the pulse bandwidth to $R_b/2$, then following the procedure of Example 7.1, we
can show that (see Prob. 7.3-9) only the following pulse $p(t)$ meets the requirement in Eq. (7.37)
for the duobinary pulse:

$$p(t) = \frac{\sin (\pi R_b t)}{\pi R_b t\,(1 - R_b t)} \tag{7.38}$$

The Fourier transform $P(f)$ of the pulse $p(t)$ is given by (see Prob. 7.3-9)

$$P(f) = \frac{2}{R_b} \cos\left(\frac{\pi f}{R_b}\right) \Pi\left(\frac{f}{R_b}\right) e^{-j\pi f / R_b} \tag{7.39}$$

The pulse $p(t)$ and its amplitude spectrum $|P(f)|$ are shown in Fig. 7.16.† This pulse transmits
binary data at a rate of $R_b$ bit/s and has the theoretical minimum bandwidth $R_b/2$ Hz. Equation
(7.38) shows that this pulse decays rapidly with time as $1/t^2$. This pulse is not ideally realizable
because $p(t)$ is noncausal and has infinite duration [because $P(f)$ is band-limited]. However,
it decays rapidly (as $1/t^2$), and therefore can be closely approximated.
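As a quick numerical check that the pulse of Eq. (7.38) meets the sampling requirement of Eq. (7.37), the following Python sketch evaluates it at the sampling instants. The function name, the normalization R_b = 1, and the explicit limits used at the removable singularities t = 0 and t = T_b are assumptions for illustration.

```python
import math

def duobinary_pulse(t, rb):
    """Eq. (7.38): p(t) = sin(pi R_b t) / (pi R_b t (1 - R_b t))."""
    u = rb * t
    # Removable singularities at t = 0 and t = T_b, where the limit is 1
    if abs(u) < 1e-12 or abs(u - 1.0) < 1e-12:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u * (1.0 - u))

RB = 1.0  # assumed bit rate, so T_b = 1 s

# Samples p(nT_b) for n = -4, ..., 5
samples = {n: duobinary_pulse(n / RB, RB) for n in range(-4, 6)}

# Tail comparison at t = 10.5 T_b against the 1/t decay of sinc(pi R_b t)
tail_duobinary = abs(duobinary_pulse(10.5 / RB, RB))
tail_sinc = abs(math.sin(math.pi * 10.5) / (math.pi * 10.5))
```

The samples are 1 at n = 0 and n = 1 and vanish (to rounding error) at every other integer, while the tail is roughly an order of magnitude smaller than that of the sinc pulse at the same instant.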

It may come as a surprise that we are able to achieve the theoretical rate using the duobinary 
pulse. In fact, it is an illusion. The theoretical rate of transmission is 2 pieces of independent 
information per second per hertz bandwidth. We have achieved this rate for binary information.
Here is the catch! A piece of binary information does not qualify as an independent piece of 
information because it cannot take on an arbitrary value. It must be selected from a finite set. 
The duobinary pulse would fail if the pulses were truly independent pieces of information, 
that is, if the pulses were to have arbitrary amplitudes. The scheme works only because the 
binary pulses take on finite known values, and hence, there are only a finite (known) number of 
interference patterns between pulses, which permits correct determination of pulse amplitudes 
despite interference. 


† The phase spectrum is linear with $\theta_p(f) = -\pi f T_b$.



7.3.5 Pulse Relationship between Zero-ISI, Duobinary, and Modified Duobinary

Now we can establish the simple relationship between a pulse $p_a(t)$ satisfying the first Nyquist
criterion (zero ISI) and a duobinary pulse $p_b(t)$ (with controlled ISI). From Eqs. (7.23) and
(7.37), it is clear that $p_a(kT_b)$ and $p_b(kT_b)$ only differ for $k = 1$. They have identical sample
values for all other integer $k$. Therefore, one can easily construct a pulse $p_b(t)$ from $p_a(t)$ by

$$p_b(t) = p_a(t) + p_a(t - T_b)$$

This addition is the "controlled" ISI or partial-response signaling that we deliberately introduced
to reduce the bandwidth requirement. To see what effect "duobinary" signaling has on
the spectral bandwidth, consider the relationship of the two pulses in the frequency domain:

$$P_b(f) = P_a(f)\left[1 + e^{-j2\pi f T_b}\right] \tag{7.40a}$$

$$|P_b(f)| = |P_a(f)|\,\sqrt{2 + 2\cos (2\pi f T_b)} = 2\,|P_a(f)|\,|\cos (\pi f T_b)| \tag{7.40b}$$

We can see that partial-response signaling is actually forcing a frequency null at $2\pi f T_b = \pi$ or,
equivalently, $f = 0.5/T_b$. Therefore, conceptually we can see how partial-response signaling
provides an additional opportunity to reshape the PSD or the transmission bandwidth. Indeed,
duobinary signaling, by forcing a frequency null at $0.5/T_b$, forces its essential bandwidth to
be at the minimum transmission bandwidth needed for a data rate of $1/T_b$ (as discussed in
Sec. 6.1.3).

In fact, many physical channels such as magnetic recording have a zero gain at dc. Therefore,
it makes no sense for the baseband signal to have any dc component in its PSD. Modified
partial-response signaling is often adopted to force a null at dc. One notable example is the
so-called modified duobinary signaling that requires

$$p_c(nT_b) = \begin{cases} 1 & n = -1 \\ -1 & n = 1 \\ 0 & \text{for all other integers } n \end{cases} \tag{7.41}$$

A similar argument indicates that $p_c(t)$ can be generated from any pulse $p_a(t)$ satisfying the
first Nyquist criterion via

$$p_c(t) = p_a(t + T_b) - p_a(t - T_b)$$

Equivalently, in the frequency domain, the modified duobinary pulse is

$$P_c(f) = 2j\,P_a(f) \sin (2\pi f T_b)$$

which uses $\sin (2\pi f T_b)$ to force a null at dc to comply with the physical channel constraint.
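The location of the forced spectral nulls can be confirmed numerically. The short Python sketch below, which is illustrative only, evaluates the two factors that multiply $P_a(f)$ in the duobinary and modified duobinary cases. The function names and the choice $T_b = 1$ are assumptions.

```python
import cmath
import math

TB = 1.0  # assumed bit interval in seconds

def duobinary_factor(f):
    """Factor 1 + exp(-j 2 pi f T_b) that multiplies P_a(f) in Eq. (7.40a)."""
    return 1 + cmath.exp(-2j * math.pi * f * TB)

def modified_duobinary_factor(f):
    """Factor exp(+j 2 pi f T_b) - exp(-j 2 pi f T_b) = 2j sin(2 pi f T_b)."""
    return cmath.exp(2j * math.pi * f * TB) - cmath.exp(-2j * math.pi * f * TB)

null_duobinary = abs(duobinary_factor(0.5 / TB))     # null at f = 0.5/T_b
null_modified = abs(modified_duobinary_factor(0.0))  # null at dc
dc_gain_duobinary = abs(duobinary_factor(0.0))       # duobinary keeps its dc content
```

The duobinary factor has magnitude 2 at dc and vanishes at $f = 0.5/T_b$, whereas the modified duobinary factor vanishes at dc, in line with the expressions above.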


7.3.6 Detection of Duobinary Signaling and Differential Encoding

For the controlled ISI method of duobinary signaling, Fig. 7.17 shows the basic transmitter
diagram. We now take a closer look at the relationship of all the data symbols at the baseband
and the detection procedure. For binary message bit $I_k = 0$ or 1, the polar symbols are simply

$$a_k = 2I_k - 1$$

Under the controlled ISI, the samples of the transmission signal $y(t)$ are

$$y(kT_b) = b_k = a_k + a_{k-1} \tag{7.42}$$

The question for the receiver is how to detect $I_k$ from $y(kT_b)$ or $b_k$. This question can be
answered by first considering all the possible values of $b_k$ or $y(kT_b)$. Because $a_k = \pm 1$, then
$b_k = 0, \pm 2$. From Eq. (7.42), it is evident that

$$\begin{aligned} b_k = 2 \;&\Rightarrow\; a_k = 1 \quad &&\text{or} \quad I_k = 1 \\ b_k = -2 \;&\Rightarrow\; a_k = -1 \quad &&\text{or} \quad I_k = 0 \\ b_k = 0 \;&\Rightarrow\; a_k = -a_{k-1} \quad &&\text{or} \quad I_k = 1 - I_{k-1} \end{aligned} \tag{7.43}$$

Figure 7.17 Equivalent duobinary signaling.

Therefore, a simple detector of duobinary signaling is to first detect all the bits $I_k$ corresponding
to $b_k = \pm 2$. The remaining $\{b_k\}$ are zero-valued samples that imply transition: that is, the
current digit is 1 and the previous digit is 0, or vice versa. This means the digit detection must
be based on the previous digit. An example of this digit-by-digit detection was shown in
Table 7.1. The disadvantage of the detection method in Eq. (7.43) is that when $y(kT_b) = 0$, the
current bit decision depends on the previous bit decision. If the previous digit were detected
incorrectly, then the error would tend to propagate, until a sample value of $\pm 2$ appears. To
mitigate this error propagation problem, we apply an effective mechanism known as differential
coding.

Figure 7.18 illustrates a duobinary signal generator by introducing an additional differential
encoder prior to partial-response pulse generation. As shown in Fig. 7.18, differential
encoding is a very simple step that changes the relationship between line code and the message
bits. Differential encoding generates a new binary sequence

$$p_k = I_k \oplus p_{k-1} \qquad \text{modulo 2}$$

with the assumption that the precoder initial state is either $p_0 = 0$ or $p_0 = 1$. Now, the precoder
output enters a polar line coder and generates

$$a_k = 2p_k - 1$$

Because of the duobinary signaling $b_k = a_k + a_{k-1}$ and the zero-ISI pulse generator, the
samples of the received signal $y(t)$ without noise become

$$\begin{aligned} y(kT_b) = b_k &= a_k + a_{k-1} \\ &= 2(p_k + p_{k-1}) - 2 \\ &= 2\left(p_{k-1} \oplus I_k + p_{k-1} - 1\right) \\ &= \begin{cases} 2(1 - I_k) & p_{k-1} = 1 \\ 2(I_k - 1) & p_{k-1} = 0 \end{cases} \end{aligned} \tag{7.44}$$

Based on Eq. (7.44), we can summarize the direct relationship between the message bits and
the sample values as

$$y(kT_b) = \begin{cases} 0 & I_k = 1 \\ \pm 2 & I_k = 0 \end{cases} \tag{7.45}$$

This relationship serves as our basis for a symbol-by-symbol detection algorithm. In short, the
decision algorithm is based on the current sample $y(kT_b)$. When there is no noise, $y(kT_b) = b_k$
and the receiver decision is

$$I_k = \frac{2 - |y(kT_b)|}{2} \tag{7.46}$$
Figure 7.18 ... encoded duobinary signaling.

TABLE 7.2
Binary Duobinary Signaling with Differential Encoding

Time k            0    1    2    3    4    5    6    7    8    9   10   11   12   13
I_k                    1    1    0    1    1    0    0    0    1    0    1    1    1
p_k               0    1    0    0    1    0    0    0    0    1    1    0    1    0
a_k              -1    1   -1   -1    1   -1   -1   -1   -1    1    1   -1    1   -1
b_k                    0    0   -2    0    0   -2   -2   -2    0    2    0    0    0
Detected bits          1    1    0    1    1    0    0    0    1    0    1    1    1


Therefore, the incorporation of differential encoding with duobinary signaling not only simplifies
the decision rule but also makes the decision independent of the previous digit and
eliminates error propagation. In Table 7.2, the example of Table 7.1 is recalculated with
differential encoding. The decoding relationship of Eq. (7.45) is clearly shown in this example.
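The chain of precoding, polar line coding, duobinary summation, and symbol-by-symbol decision can be sketched as follows. This Python sketch is illustrative only: the function names are hypothetical and the channel is assumed noise-free, with the precoder started at $p_0 = 0$ as in Table 7.2.

```python
def precode(bits, p0=0):
    """Differential encoder p_k = I_k XOR p_{k-1}; returns [p_0, p_1, ..., p_K]."""
    p = [p0]
    for b in bits:
        p.append(b ^ p[-1])
    return p

def duobinary_from_precoded(p):
    """Polar symbols a_k = 2 p_k - 1 and samples b_k = a_k + a_{k-1} for k >= 1."""
    a = [2 * pk - 1 for pk in p]
    b = [cur + prev for prev, cur in zip(a, a[1:])]
    return a, b

def decide(samples):
    """Symbol-by-symbol rule of Eq. (7.46): I_k = (2 - |y(kT_b)|) / 2."""
    return [(2 - abs(y)) // 2 for y in samples]

info = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1]   # message bits I_1..I_13 of Table 7.2
p = precode(info)
a, b = duobinary_from_precoded(p)
detected = decide(b)
```

The sketch reproduces the $p_k$, $a_k$, and $b_k$ rows of Table 7.2. Each decision uses only the current sample, so starting the precoder from $p_0 = 1$ instead changes the signs of the nonzero samples but not the detected bits.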

The differential encoding defined for binary information symbols can be conveniently
generalized to nonbinary symbols. When the information symbols $I_k$ are $M$-ary, the only change
to the differential encoding block is to replace "modulo 2" with "modulo M." Similarly, other
generalized partial-response signaling such as the modified duobinary must also face the error
propagation problem at its detection. A suitable type of differential encoding can be similarly
adopted to prevent error propagation.


7.3.7 Pulse Generation 

A pulse $p(t)$ satisfying a Nyquist criterion can be generated as the unit impulse response of a
filter with transfer function $P(f)$. This will not always be easy. A better method is to generate
the waveform directly, using a transversal filter (tapped delay line) discussed here. The pulse
$p(t)$ to be generated is sampled with a sufficiently small sampling interval $T_s$ (Fig. 7.19a),
and the filter tap gains are set in proportion to these sample values in sequence, as shown
in Fig. 7.19b. When a narrow rectangular pulse with the width $T_s$, the sampling interval, is
applied at the input of the transversal filter, the output will be a staircase approximation of
$p(t)$. This output, when passed through a low-pass filter, is smoothed out. The approximation
can be improved by reducing the pulse sampling interval $T_s$.

It should be stressed once again that the pulses arriving at the detector input of the receiver 
need to meet the desired Nyquist criterion. Hence, the transmitted pulses should be so shaped 
that after passing through the channel, they are received in the desired (Nyquist) form. In 
practice, however, pulses need not be shaped rigidly at the transmitter. The final shaping can 
be carried out by an equalizer at the receiver, as discussed later (Sec. 7.5).


7.4 SCRAMBLING 

In general, a scrambler tends to make the data more random by removing long strings of
1s or 0s. Scrambling can be helpful in timing extraction by removing long strings of 0s in
binary data. Scramblers, however, are primarily used for preventing unauthorized access to
the data, and they are optimized for that purpose. Such optimization may actually result in
generation of a long string of zeros in the data. The digital network must be able to cope with
these long zero strings by using the zero replacement techniques discussed in Sec. 7.2.




Figure 7.20 shows a typical scrambler and descrambler. The scrambler consists of a feedback
shift register, and the matching descrambler has a feedforward shift register, as shown
in Fig. 7.20. Each stage in the shift register delays a bit by one unit. To analyze the scrambler
and the matched descrambler, consider the output sequence $T$ of the scrambler (Fig. 7.20a). If
$S$ is the input sequence to the scrambler, then

$$S \oplus D^3 T \oplus D^5 T = T \tag{7.47}$$

where $D$ represents the delay operator: that is, $D^n T$ is the sequence $T$ delayed by $n$ units. Now,
recall that the modulo 2 sum of any sequence with itself gives a sequence of all 0s. Adding
$(D^3 \oplus D^5)T$ to both sides of Eq. (7.47), we get

$$\begin{aligned} S &= T \oplus (D^3 \oplus D^5)T \\ &= [1 \oplus (D^3 \oplus D^5)]T \\ &= (1 \oplus F)T \end{aligned} \tag{7.48}$$

where $F = D^3 \oplus D^5$.

To design the descrambler at the receiver, we start with $T$, the sequence received at the
descrambler. From Eq. (7.48), it follows that

$$T \oplus FT = T \oplus (D^3 \oplus D^5)T = S$$

This equation, in which we regenerate the input sequence $S$ from the received sequence $T$, is
readily implemented by the descrambler shown in Fig. 7.20b.

Note that a single detection error in the received sequence $T$ will affect three output bits
in $R$. Hence, scrambling has the disadvantage of causing multiple errors for a single received
bit error.

Example 7.2 The data stream 101010100000111 is fed to the scrambler in Fig. 7.20a. Find the scrambler
output $T$, assuming the initial content of the registers to be zero.

From Fig. 7.20a we observe that initially $T = S$, and the sequence $S$ enters the register
and is returned as $(D^3 \oplus D^5)S = FS$ through the feedback path. This new sequence $FS$
again enters the register and is returned as $F^2 S$, and so on. Hence

$$\begin{aligned} T &= S \oplus FS \oplus F^2 S \oplus F^3 S \oplus \cdots \\ &= (1 \oplus F \oplus F^2 \oplus F^3 \oplus \cdots)S \end{aligned} \tag{7.49}$$

Recognizing that

$$F = D^3 \oplus D^5$$

we have

$$F^2 = (D^3 \oplus D^5)(D^3 \oplus D^5) = D^6 \oplus D^{10} \oplus D^8 \oplus D^8$$

Because modulo-2 addition of any sequence with itself is zero, $D^8 \oplus D^8 = 0$, and

$$F^2 = D^6 \oplus D^{10}$$

Similarly

$$F^3 = (D^6 \oplus D^{10})(D^3 \oplus D^5) = D^9 \oplus D^{11} \oplus D^{13} \oplus D^{15}$$

and so on. Hence [see Eq. (7.49)],

$$T = (1 \oplus D^3 \oplus D^5 \oplus D^6 \oplus D^9 \oplus D^{10} \oplus D^{11} \oplus D^{12} \oplus D^{13} \oplus D^{15} \oplus \cdots)S$$

Because $D^n S$ is simply the sequence $S$ delayed by $n$ bits, various terms in the above
equation correspond to the following sequences:

S         = 101010100000111
D^3 S     = 000101010100000111
D^5 S     = 00000101010100000111
D^6 S     = 000000101010100000111
D^9 S     = 000000000101010100000111
D^10 S    = 0000000000101010100000111
D^11 S    = 00000000000101010100000111
D^12 S    = 000000000000101010100000111
D^13 S    = 0000000000000101010100000111
D^15 S    = 000000000000000101010100000111
T         = 101110001101001

Note that the input sequence contains the periodic sequence 10101010···, as well as a
long string of 0s. The scrambler output effectively removes the periodic component, as
well as the long string of 0s. The input sequence has 15 digits. The scrambler output up
to the 15th digit only is shown, because all the output digits beyond 15 depend on input
digits beyond 15, which are not given.

Readers can verify that the descrambler output is indeed S when the foregoing 
sequence T is applied at its input. 
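The scrambler and descrambler relations can also be exercised directly as shift-register recursions instead of expanding the series in Eq. (7.49). The following Python sketch is illustrative, with hypothetical function names; it assumes the feedback taps at delays 3 and 5 ($F = D^3 \oplus D^5$) and all-zero initial register contents.

```python
def scramble(bits, taps=(3, 5)):
    """Feedback shift register: T_k = S_k XOR T_{k-3} XOR T_{k-5}, zero initial state."""
    out = []
    for k, s in enumerate(bits):
        t = s
        for d in taps:
            if k >= d:              # register still holds zeros before this point
                t ^= out[k - d]
        out.append(t)
    return out

def descramble(bits, taps=(3, 5)):
    """Feedforward shift register: S_k = T_k XOR T_{k-3} XOR T_{k-5}."""
    out = []
    for k, t in enumerate(bits):
        s = t
        for d in taps:
            if k >= d:
                s ^= bits[k - d]
        out.append(s)
    return out

S = [int(c) for c in "101010100000111"]   # input data stream of Example 7.2
T = scramble(S)
recovered = descramble(T)

# A single error in the received sequence corrupts three descrambler output bits
T_err = T[:]
T_err[2] ^= 1
errors = sum(x != y for x, y in zip(descramble(T_err), S))
```

For the data stream of Example 7.2 the recursion gives the same 15 output digits as the series expansion, the descrambler returns the original stream, and flipping one received bit produces three output errors.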


7.5 DIGITAL RECEIVERS AND 
REGENERATIVE REPEATERS 


Basically, a receiver or a regenerative repeater performs three functions: (1) reshaping incoming
pulses by means of an equalizer, (2) extracting the timing information required to sample
incoming pulses at optimum instants, and (3) making symbol detection decisions based on the
pulse samples. The repeater shown in Fig. 7.21 consists of a receiver plus a "regenerator." A
complete repeater may also include provision for separation of dc power from ac signals. This
is normally accomplished by transformer-coupling the signals and bypassing the dc around
the transformers to the power supply circuitry.*

Figure 7.21 Regenerative repeater.


7.5.1 Equalizers 

A pulse train is attenuated and distorted by the transmission medium. The attenuation can
be compensated by the preamplifier, whereas the distortion is compensated by the equalizer.
Channel distortion is in the form of dispersion, which is caused by an attenuation of certain
critical frequency components of the data pulse train. Theoretically, an equalizer should have a
frequency characteristic that is the inverse of that of the transmission medium. This will restore
the critical frequency components and eliminate pulse dispersion. Unfortunately, this also
enhances the received channel noise by boosting its components at these critical frequencies.
This undesirable phenomenon is known as noise amplification.

For digital signals, however, complete equalization is really not necessary, because a
detector only needs to make relatively simple decisions, such as whether the pulse is positive
or negative (or whether the pulse is present or absent). Therefore, considerable pulse dispersion
can be tolerated. Pulse dispersion results in ISI and the consequent increase in detection errors.
Noise increase resulting from the equalizer (which boosts the high frequencies) also increases
the detection error probability. For this reason, design of an optimum equalizer involves an
inevitable compromise between reducing ISI and reducing the channel noise. A judicious choice
of the equalization characteristics is a central feature in all well-designed digital communication
systems.⁶

Zero-Forcing Equalizer 

It is really not necessary to eliminate or minimize ISI (interference) with neighboring pulses
for all $t$. All that is needed is to eliminate or minimize interference with neighboring pulses
at their respective sampling instants only. This is because the receiver decision is based on
sample values only. This kind of (relaxed) equalization can be accomplished by equalizers
using the transversal filter structure encountered earlier. Unlike traditional filters, transversal
filter equalizers are easily adjustable to compensate against different channels or even slowly
time-varying channels. The design goal is to force the equalizer output pulse to have zero ISI
values at the sampling (decision-making) instants. In other words, the equalizer output pulses
satisfy the Nyquist or the controlled ISI criterion. The time delay $T$ between successive taps
is chosen to be $T_b$, the interval between pulses.

Figure 7.22 Zero-forcing equalizer analysis.

* The repeater usually includes circuitry to protect the electronics of the regenerator from high-voltage transients
induced by power surges and lightning. Special transformer windings may be provided to couple fault-locate signals
into a cable pair dedicated to the purpose.

To begin, set the tap gains $c_0 = 1$ and $c_k = 0$ for all other values of $k$ in the transversal filter
in Fig. 7.22a. Thus the output of the filter will be the same as the input delayed by $NT_b$. For a
single pulse $p_r(t)$ (Fig. 7.22b) at the input of the transversal filter with the tap setting just given,
the filter output $p_o(t)$ will be exactly $p_r(t - NT_b)$, that is, $p_r(t)$ delayed by $NT_b$. This delay has
no practical effect on our communication system and is not relevant to our discussion. Hence,
for convenience, we shall ignore this delay. This means that $p_r(t)$ in Fig. 7.22b also represents
the filter output $p_o(t)$ for this tap setting ($c_0 = 1$ and $c_k = 0$, $k \ne 0$). We require that the
output pulse $p_o(t)$ satisfy Nyquist's criterion or the controlled ISI criterion, as the case may
be. For the Nyquist criterion, the output pulse $p_o(t)$ must have zero values at all the multiples
of $T_b$. From Fig. 7.22b, we see that the pulse amplitudes $a_1$, $a_{-1}$, and $a_2$ at $T_b$, $-T_b$, and $2T_b$,
respectively, are not negligible. By adjusting the tap gains ($c_k$), we generate additional shifted
pulses of proper amplitudes that will force the resulting output pulse to have desired values at
$t = 0, \pm T_b, \pm 2T_b, \ldots$.

The output $p_o(t)$ (Fig. 7.22c) is the sum of pulses of the form $c_k p_r(t - kT_b)$ (ignoring the
delay of $NT_b$). Thus

$$p_o(t) = \sum_{n=-N}^{N} c_n\, p_r(t - nT_b) \tag{7.50}$$

The samples of $p_o(t)$ at $t = kT_b$ are

$$p_o(kT_b) = \sum_{n=-N}^{N} c_n\, p_r(kT_b - nT_b) \qquad k = 0, \pm 1, \pm 2, \pm 3, \ldots \tag{7.51a}$$

By using a more convenient notation $p_r[k]$ to denote $p_r(kT_b)$ and $p_o[k]$ to denote $p_o(kT_b)$,
Eq. (7.51a) can be expressed as

$$p_o[k] = \sum_{n=-N}^{N} c_n\, p_r[k - n] \qquad k = 0, \pm 1, \pm 2, \pm 3, \ldots \tag{7.51b}$$

Nyquist's first criterion requires the samples $p_o[k] = 0$ for $k \ne 0$, and $p_o[k] = 1$ for $k = 0$.
Upon substituting these values in Eq. (7.51b), we obtain an infinite set of simultaneous equations
in terms of $2N + 1$ variables. Clearly, it is not possible to solve all the equations. However, if
we specify the values of $p_o[k]$ only at $2N + 1$ points as

$$p_o[k] = \begin{cases} 1 & k = 0 \\ 0 & k = \pm 1, \pm 2, \ldots, \pm N \end{cases} \tag{7.52}$$

then a unique solution exists. This assures that a pulse will have zero interference at sampling
instants of $N$ preceding and $N$ succeeding pulses. Because the pulse amplitude decays rapidly,
interference beyond the $N$th pulse is not significant for $N > 2$, in general. Substitution of the
condition (7.52) into Eq. (7.51b) yields a set of $2N + 1$ simultaneous equations for $2N + 1$
variables. These $2N + 1$ equations can be rewritten in the matrix form of


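$$\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \underbrace{\begin{bmatrix} p_r[0] & p_r[-1] & \cdots & p_r[-2N+1] & p_r[-2N] \\ p_r[1] & p_r[0] & \cdots & p_r[-2N+2] & p_r[-2N+1] \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ p_r[2N-1] & p_r[2N-2] & \cdots & p_r[0] & p_r[-1] \\ p_r[2N] & p_r[2N-1] & \cdots & p_r[1] & p_r[0] \end{bmatrix}}_{\mathbf{P}_r} \underbrace{\begin{bmatrix} c_{-N} \\ c_{-N+1} \\ \vdots \\ c_{-1} \\ c_0 \\ c_1 \\ \vdots \\ c_{N-1} \\ c_N \end{bmatrix}}_{\mathbf{c}} \tag{7.53}$$

In this compact expression, the $(2N + 1) \times (2N + 1)$ matrix $\mathbf{P}_r$ has identical entries along all
the diagonal lines. Such a matrix is known as the Toeplitz matrix and is commonly encountered
in describing convolutive relationships. A Toeplitz matrix is fully determined by its first row
and first column. It has some nice properties and admits simpler algorithms for computing its
inverse (see, e.g., the method by Trench⁷). The tap gains $c_k$ can be obtained by solving this set
of equations by taking the inverse of the matrix $\mathbf{P}_r$:

$$\mathbf{c} = \mathbf{P}_r^{-1} \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^T$$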


Example 7.3 For the received pulse $p_r(t)$ in Fig. 7.22b, let

$$p_r[0] = 1 \qquad p_r[1] = -0.3 \qquad p_r[2] = 0.1 \qquad p_r[-1] = -0.2 \qquad p_r[-2] = 0.05$$

Design a three-tap ($N = 1$) equalizer.

Substituting the foregoing values in Eq. (7.53), we obtain

$$\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & -0.2 & 0.05 \\ -0.3 & 1 & -0.2 \\ 0.1 & -0.3 & 1 \end{bmatrix} \begin{bmatrix} c_{-1} \\ c_0 \\ c_1 \end{bmatrix} \tag{7.54}$$

Solution of this set yields $c_{-1} = 0.210$, $c_0 = 1.13$, and $c_1 = 0.318$. This tap setting assures
us that $p_o[0] = 1$ and $p_o[-1] = p_o[1] = 0$. The ideal output $p_o(t)$ is sketched in Fig. 7.22c.
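The tap computation of Example 7.3 can be reproduced with a small linear solve. The following Python sketch is illustrative rather than part of the text: the helper names are hypothetical, a plain Gaussian elimination stands in for the matrix inverse, and received samples that the example does not list are assumed to be zero.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - acc) / m[r][r]
    return x

# Received pulse samples p_r[k] of Example 7.3; unlisted samples are assumed zero.
PR = {-2: 0.05, -1: -0.2, 0: 1.0, 1: -0.3, 2: 0.1}

def pr(k):
    return PR.get(k, 0.0)

N = 1
idx = range(-N, N + 1)
matrix = [[pr(k - n) for n in idx] for k in idx]     # Toeplitz matrix P_r of Eq. (7.53)
target = [1.0 if k == 0 else 0.0 for k in idx]
taps = solve_linear(matrix, target)                  # [c_{-1}, c_0, c_1]

def equalized(k):
    """Equalizer output sample p_o[k] from Eq. (7.51b)."""
    return sum(c * pr(k - n) for c, n in zip(taps, idx))
```

The computed taps agree with the rounded values quoted in the example, and `equalized(k)` can be evaluated at any integer `k` to inspect the output pulse samples, including those outside the window where zeros were forced.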


Note that the equalizer determined from Eq. (7.53) can guarantee only the zero ISI condition of
Eq. (7.52). In other words, ISI is zero only for k = 0, ±1, ..., ±N. In fact, for k outside this
range, it is quite common that the samples $p_o(kT_b) \neq 0$, indicating some residual ISI. For
instance, consider the equalizer problem in Example 7.3. The samples of the equalized pulse have
zero ISI for k = -1, 0, 1. However, from

$$
p_o[k] = \sum_{n=-N}^{N} c_n \, p_r[k - n]
$$

we can see that the three-tap zero-forcing equalizer parameters will lead to

$$
p_o[-3] = 0.010 \qquad p_o[-2] = 0.0145 \qquad p_o[2] = 0.0176 \qquad p_o[3] = 0.0318
$$
$$
p_o[k] = 0 \qquad k = \pm 1, \pm 4, \ldots
$$
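As a numerical check of Example 7.3, the following minimal Python sketch builds the Toeplitz matrix
of Eq. (7.53), solves for the taps, and evaluates the equalized pulse outside the forced window.
The helper names (solve_linear, zero_forcing_taps, equalized_pulse) and the dictionary used to hold
the samples p_r[k] are illustrative assumptions, not part of the text.

```python
# Zero-forcing design of a (2N+1)-tap transversal equalizer, standard library only.
# p_r maps the sample index k to p_r[k]; indices that are absent are taken as zero.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def zero_forcing_taps(p_r, N):
    """Return {n: c_n} for n = -N..N by solving the system of Eq. (7.53)."""
    idx = range(-N, N + 1)
    P = [[p_r.get(k - n, 0.0) for n in idx] for k in idx]  # Toeplitz matrix P_r
    d = [1.0 if k == 0 else 0.0 for k in idx]              # desired response
    return dict(zip(idx, solve_linear(P, d)))

def equalized_pulse(p_r, c, k):
    """p_o[k] = sum over n of c_n * p_r[k - n]."""
    return sum(c_n * p_r.get(k - n, 0.0) for n, c_n in c.items())

p_r = {-2: 0.05, -1: -0.2, 0: 1.0, 1: -0.3, 2: 0.1}
c = zero_forcing_taps(p_r, N=1)
p_o = {k: equalized_pulse(p_r, c, k) for k in range(-4, 5)}
```

Run as written, the taps come out near 0.209, 1.126, and 0.317, which agrees with the rounded
values quoted in the example, and the samples at k = ±2 and ±3 are small but nonzero.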

It is therefore clear that not all the ISI has been removed because of these four nonzero samples
of the equalizer output pulse. In fact, because we only have 2N + 1 (N = 1 in Example 7.3)
parameters in the equalizer, it is impossible to force $p_o[k] = 0$ for all k ≠ 0 unless N = ∞.
This means that we will not be able to design a practical finite tap equalizer that achieves
perfect zero ISI. Still, when N is sufficiently large, the residual nonzero sample values will
typically be small, indicating that most of the ISI has been suppressed.

Minimum Mean Square Error (MMSE) Method

In practice, an alternative approach is to minimize the mean square difference between the
equalizer output response $p_o[k]$ and the desired zero ISI response. This is known as the
minimum mean square error (MMSE) method for designing transversal filter equalizers. The MMSE
method does not try to force the pulse samples to zero at 2N points. Instead, we minimize the
squared errors averaged over a set of output samples. This method involves more simultaneous
equations. Thus we must find the equalizer tap values to minimize the average (mean) square
error over a larger window [-K, K]:

$$
\text{MSE} \triangleq \frac{1}{2K+1} \sum_{k=-K}^{K} \left( p_o[k] - \delta[k] \right)^2
$$

where we use a function known as the Kronecker delta

$$
\delta[k] = \begin{cases} 1 & k = 0 \\ 0 & k \neq 0 \end{cases}
$$

The solution to this minimization problem can be better represented in matrix form as

$$
\mathbf{c} = \mathbf{P}_r^{\dagger}
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$

where $\mathbf{P}_r^{\dagger}$ represents the Moore-Penrose pseudo-inverse of the nonsquare matrix
$\mathbf{P}_r$ of size (2K + 1) × (2N + 1). The MMSE design often leads to a more robust equalizer
for the reduction of ISI.
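The MMSE design can be sketched in a few lines of Python. The sketch below assumes the nonsquare
matrix has full column rank, in which case applying the pseudo-inverse is equivalent to solving the
normal equations; it reuses the pulse samples of Example 7.3 purely as test data. The function names
(mmse_taps, mean_square_error) and the window K = 3 are illustrative choices, not taken from the
text.

```python
# Least-squares (MMSE) tap design over the window [-K, K], standard library only.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mmse_taps(p_r, N, K):
    """Return {n: c_n} minimizing the sum over k in [-K, K] of (p_o[k] - delta[k])^2."""
    taps = range(-N, N + 1)
    window = range(-K, K + 1)
    P = [[p_r.get(k - n, 0.0) for n in taps] for k in window]  # (2K+1) x (2N+1)
    d = [1.0 if k == 0 else 0.0 for k in window]
    size = len(taps)
    # Normal equations (P^T P) c = P^T d; same result as pinv(P) d for full column rank.
    gram = [[sum(row[i] * row[j] for row in P) for j in range(size)] for i in range(size)]
    rhs = [sum(row[i] * d_k for row, d_k in zip(P, d)) for i in range(size)]
    return dict(zip(taps, solve_linear(gram, rhs)))

def mean_square_error(p_r, c, K):
    """Average squared deviation of p_o[k] from the Kronecker delta over [-K, K]."""
    total = 0.0
    for k in range(-K, K + 1):
        p_o = sum(c_n * p_r.get(k - n, 0.0) for n, c_n in c.items())
        total += (p_o - (1.0 if k == 0 else 0.0)) ** 2
    return total / (2 * K + 1)

p_r = {-2: 0.05, -1: -0.2, 0: 1.0, 1: -0.3, 2: 0.1}
zf = mmse_taps(p_r, N=1, K=1)   # K = N gives a square system: the zero-forcing taps
mm = mmse_taps(p_r, N=1, K=3)   # wider window: least-squares compromise
```

With K = N the system is square and the result coincides with the zero-forcing solution. With the
wider window, the three taps trade exact zeros at k = ±1 for a smaller error averaged over the
whole window.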


Adaptive Equalization and Other More General Equalizers 

The equalizer filter structure that is described here has the simplest form. Practical digital
communication systems often apply much more sophisticated equalizer structures and more
advanced equalization algorithms.⁶ Because of the probabilistic tools needed, we will defer
detailed coverage of the specialized topic of equalization to Chapter 12.


7.5.2 Timing Extraction

The received digital signal needs to be sampled at precise instants. This requires a clock signal
at the receiver in synchronism with the clock signal at the transmitter (symbol or bit
synchronization), delayed by the channel response. Three general methods of synchronization exist:

1. Derivation from a primary or a secondary standard (e.g., transmitter and receiver slaved to 
a master timing source). 

2. Transmitting a separate synchronizing signal (pilot clock). 

3. Self-synchronization, where the timing information is extracted from the received signal 
itself. 

Because of its high cost, the first method is suitable for large volumes of data and high-speed
communication systems. The second method, in which part of the channel capacity is used
to transmit timing information, is suitable when the available capacity is large in comparison
to the data rate and when additional transmission power can be spared. The third method is
a very efficient method of timing extraction or clock recovery because the timing is derived
from the received message signal itself. An example of the self-synchronization method will
be discussed here.

Figure 7.23 Timing extraction.

We have already shown that a digital signal, such as an on-off signal (Fig. 7.3a), contains
a discrete component of the clock frequency itself (Fig. 7.3c). Hence, when the on-off binary
signal is applied to a resonant circuit tuned to the clock frequency, the output signal is the
desired clock signal.

Not all binary signals contain a discrete component of the clock frequency. For example,
a bipolar signal has no discrete component of any frequency [see Eq. (7.21) or Fig. 7.9]. In such
cases, it may be possible to extract timing by using a nonlinear device to generate a frequency
tone that is related to the timing clock. In the bipolar case, for instance, a simple rectification
converts a bipolar signal to an on-off signal, which can readily be used to extract timing.
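The effect of rectification can be illustrated numerically. The sketch below assumes half-width
(RZ) rectangular pulses, 16 samples per bit, and a random bit pattern, and it uses the Fourier
coefficient at the clock frequency as a stand-in for the output of the resonant circuit. These
choices, and the function names, are assumptions made for illustration only.

```python
import cmath
import random

def line_coded_waveform(bits, samples_per_bit, bipolar):
    """Half-width (RZ) pulses; in the bipolar case successive 1s alternate in sign."""
    half = samples_per_bit // 2
    sign = 1.0
    out = []
    for b in bits:
        amp = 0.0
        if b:
            amp = sign
            if bipolar:
                sign = -sign
        out.extend([amp] * half + [0.0] * (samples_per_bit - half))
    return out

def clock_component(x, samples_per_bit):
    """Magnitude of the Fourier coefficient of x at the clock frequency 1/T_b."""
    acc = sum(v * cmath.exp(-2j * cmath.pi * i / samples_per_bit)
              for i, v in enumerate(x))
    return abs(acc) / len(x)

random.seed(1)
bits = [random.randint(0, 1) for _ in range(400)]
bipolar = line_coded_waveform(bits, 16, bipolar=True)
rectified = [abs(v) for v in bipolar]      # full-wave rectifier: bipolar -> on-off

before = clock_component(bipolar, 16)      # nearly zero: no discrete clock component
after = clock_component(rectified, 16)     # clearly nonzero: clock tone is present
```

The alternating signs of the bipolar pulses cancel the clock-frequency term almost completely,
whereas the rectified (on-off) waveform retains a component proportional to the fraction of 1s in
the data.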

Small random deviations of the incoming pulses from their ideal location (known as timing
jitter) are always present, even in the most sophisticated systems. Although the source emits
pulses at the right instants, subsequent operations during transmission (e.g., Doppler shift)
tend to cause pulses to deviate from these original positions. The Q of the tuned circuit used
for timing extraction must be large enough to provide an adequate suppression of timing jitter,
yet small enough to meet the stability requirements. During the intervals in which there are
no pulses in the input, the oscillation continues because of the flywheel effect of the high-Q
circuit. But still the oscillator output is sensitive to the pulse pattern; for example, during a
long string of 1s the output amplitude will increase, whereas during a long string of 0s it will
decrease. This introduces additional jitter in the timing signal extracted.

The complete timing extractor and time pulse generator for a polar case is shown in
Fig. 7.23. The sinusoidal output of the oscillator (timing extractor) is passed through a phase
shifter that adjusts the phase of the timing signal so that the timing pulses occur at the maximum
points. This method is used to recover the clock at each of the regenerators in a PCM system. The
jitter introduced by successive regenerators adds up, and after a certain number of regenerators
it is necessary to use a regenerator with a more sophisticated clock recovery system such as a
phase-locked loop.

Timing Jitter

Variations of the pulse positions or sampling instants cause timing jitter. This results from 
several causes, some of which are dependent on the pulse pattern being transmitted, whereas 
others are not. The former are cumulative along the chain of regenerative repeaters, since all 
the repeaters are affected in the same way, whereas the other forms of jitter are random from 
regenerator to regenerator and therefore tend to partially cancel out their mutual effects over 
a long-haul link. Random forms of jitter are caused by noise, interference, and mistuning of 
the clock circuits. Pattern-dependent jitter results from clock mistuning, amplitude-to-phase 
conversion in the clock circuit, and ISI, which alters the position of the peaks of the input 
signal according to the pattern. The rms value of the jitter over a long chain of N repeaters can
be shown to increase as $\sqrt{N}$.

Jitter accumulation over a digital link may be reduced by buffering the link with an elastic 
store and clocking out the digit stream under the control of a highly stable phase-locked loop. 
Jitter reduction is necessary about every 200 miles in a long digital link to keep the maximum 
jitter within reasonable limits. 

7.5.3 Detection Error 

Once the transmission has passed through the equalizer, detection can take place at the detector
that samples the received signal based on the clock provided by the timing extractor. The signal
received at the detector consists of the equalized pulse train plus a random channel noise. The
noise can cause error in pulse detection. Consider, for example, the case of polar transmission
using a basic pulse p(t) (Fig. 7.24a). This pulse has a peak amplitude $A_p$. A typical received
pulse train is shown in Fig. 7.24b. Pulses are sampled at their peak values. If noise were absent,
the sample of the positive pulse (corresponding to 1) would be $A_p$ and that of the negative
pulse (corresponding to 0) would be $-A_p$. Because of noise, these samples would be $\pm A_p + n$,
where n is the random noise amplitude (see Fig. 7.24b). From the symmetry of the situation,
the detection threshold is zero; that is, if the pulse sample value is positive, the digit is
detected as 1; if the sample value is negative, the digit is detected as 0.


Figure 7.24 Error probability in threshold detection.

The detector's decision of whether to declare 1 or 0 could be made readily from the pulse
sample, except that the noise value n is random, meaning that its exact value is unpredictable.
It may have a large or a small value, and it can be negative as well as positive. It is possible
that 1 is transmitted but n at the sampling instant has a large negative value. This will make
the sample value $A_p + n$ small or even negative. On the other hand, if 0 is transmitted and n
has a large positive value at the sampling instant, the sample value $-A_p + n$ can be positive
and the digit will be detected wrongly as 1. This is clear from Fig. 7.24b.
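The threshold decision just described is easy to simulate. The following sketch assumes zero-mean
Gaussian noise with standard deviation sigma, a peak amplitude of 1, and equally likely bits; none
of these values comes from the text. The closing comparison uses the Gaussian tail function, a
standard result that is not derived in this section and is quoted here only as a sanity check.

```python
import math
import random

def simulate_polar_detection(num_bits, a_p, sigma, seed=0):
    """Return the fraction of bits detected wrongly with a zero threshold."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(num_bits):
        bit = rng.randint(0, 1)
        sample = (a_p if bit else -a_p) + rng.gauss(0.0, sigma)  # +/-A_p + n
        detected = 1 if sample > 0 else 0
        errors += detected != bit
    return errors / num_bits

def q_function(x):
    """Tail probability of a standard Gaussian variable (assumed noise model)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

error_rate = simulate_polar_detection(200_000, a_p=1.0, sigma=0.5)
predicted = q_function(1.0 / 0.5)
```

Under these assumptions the measured error ratio settles close to the predicted tail probability
of roughly 2.3 percent. Lowering sigma or raising the amplitude makes wrong decisions rarer, in line
with the qualitative argument above.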

The performance of digital communication systems is typically specified by the average 
number of detection errors. For example, if two cellphones (receivers) in the same spot are 
attempting to detect the same transmission from a cellular tower, the cellphone with the lower 
number of detection errors is the better receiver. It is likely to have fewer dropped calls and less 
trouble receiving clear speech. However, because noise is random, sometimes one cellphone 
may be better while other times the other cellphone may have fewer errors. The real measure 
of receiver performance is therefore the average ratio of the number of errors to the total 
number of transmitted data. Thus, the meaningful performance comparison is the likelihood 
of detection error, or the detection error probability. 

Because the precise analysis and evaluation of this error likelihood require the knowledge 
and tools from probability theory, we will postpone error analysis until after the introduction 
of probability in Chapter 8. Later, in Chapter 10, we will discuss fully the error probability 
analysis of different digital communication systems for different noise models as well as system 
designs against different noises. For example, Gaussian noise can generally characterize the 
random channel noise from thermal effects and intersystem cross talk. Optimum detectors 
can be designed to minimize the error likelihood against Gaussian noise. However, switching 
transients, lightning strikes, power line load switching, and other singular events cause very 
high level noise pulses of short duration to contaminate the cable pairs that carry digital signals. 
These pulses, collectively called impulse noise, cannot conveniently be engineered away, and 
they constitute the most prevalent source of errors from the environment outside the digital 
systems. Errors are therefore virtually never found in isolation, but occur in bursts of up to
several hundred at a time. To correct error bursts, we use special burst error correcting codes
described in Chapter 14.


7.6 EYE DIAGRAMS: AN IMPORTANT TOOL 

In the last section, we studied the effect of noise and channel ISI on the detection of digital
transmissions. We also described the design of equalizers to compensate for the channel
distortion and explained the timing-extraction process. We now present a practical engineering
tool known as the eye diagram. The eye diagram is easy to generate and is often applied
by engineers on received signals because it makes possible the visual examination of the severity
of the ISI, the accuracy of timing extraction, the noise immunity, and other important
factors.

Figure 7.25 The eye diagram.

We need only a basic oscilloscope to generate the eye diagram. Given a baseband signal
at the channel output

$$
y(t) = \sum_k a_k \, p(t - kT_b)
$$

it can be applied to the vertical input of the oscilloscope. The time base of the scope is triggered
at the same rate $1/T_b$ as that of the incoming pulses, and it yields a sweep lasting exactly $T_b$,
the interval of one transmitted data symbol $a_k$. The oscilloscope shows the superposition of
many traces of length $T_b$ from the channel output y(t). What appears on the oscilloscope is
simply the input signal (vertical input) cut up every $T_b$ and then superimposed on top of one
another. The resulting pattern on the oscilloscope looks like a human eye, hence the name eye
diagram. More generally, we can also apply a time sweep that lasts m symbol intervals, or $mT_b$.
The oscilloscope pattern is simply the input signal (vertical input) cut up every $mT_b$ and then
superimposed on top of one another. The oscilloscope will then display an eye diagram that is
$mT_b$ wide and has the shape of m eyes in a horizontal row.
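The cut-and-superimpose operation translates directly into code. The sketch below works on a
sampled waveform and returns the list of traces that an oscilloscope would overlay. The function
names and the short polar NRZ test pattern are assumptions made for illustration, and plotting is
left out so that the example needs only the standard library.

```python
def eye_traces(samples, samples_per_symbol, m=1, offset=0):
    """Cut the sampled signal into consecutive traces that are m symbols long."""
    width = m * samples_per_symbol
    usable = samples[offset:]
    count = len(usable) // width          # drop any incomplete trace at the end
    return [usable[i * width:(i + 1) * width] for i in range(count)]

def polar_nrz(bits, samples_per_symbol):
    """Full-width rectangular pulses: +1 for a 1 and -1 for a 0."""
    out = []
    for b in bits:
        out.extend([1.0 if b else -1.0] * samples_per_symbol)
    return out

bits = [1, 0, 1, 1, 0, 0, 1, 0]
y = polar_nrz(bits, samples_per_symbol=8)
one_eye = eye_traces(y, 8, m=1)     # eight traces, each one symbol wide
two_eyes = eye_traces(y, 8, m=2)    # four traces, each two symbols wide
```

Drawing every trace against the same time axis produces the eye diagram. The offset argument plays
the role of the display time offset of the scope.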

We now present an example. Consider the transmission of a binary signal by polar NRZ
pulses (Fig. 7.25a). Its eye diagrams are shown in Fig. 7.25b for the time base of $T_b$ and
$2T_b$, respectively. In this example, the channel has infinite bandwidth to pass the NRZ pulse
and there is no channel distortion. Hence, we obtain eye diagrams with totally open eye(s).
We can also consider a channel output using the same polar line code and a different (RZ)
pulse shape, as shown in Fig. 7.25c. The resulting eye diagrams are shown in Fig. 7.25d.
In this case, the eye is wide open only at the midpoint of the pulse duration. With proper
timing extraction, the receiver should sample the received signal right at the midpoint where
the eye is totally open, to achieve the best noise immunity at the decision point (Sec. 7.5.3).
This is because the midpoint of the eye represents the best sampling instant of each pulse,
where the pulse amplitude is maximum without interference from any other neighboring pulse
(zero ISI).

We now consider a channel that is distortive or has finite bandwidth, or both. After passing
through this nonideal channel, the NRZ polar signal of Fig. 7.25a becomes the waveform of
Fig. 7.25e. The received signal pulses are no longer rectangular but are rounded, distorted,
and spread out. The eye diagrams are not fully open anymore, as shown in Fig. 7.25f. In this
case, the ISI is not zero. Hence, pulse values at their respective sampling instants will deviate
from the full-scale values by a varying amount in each trace, causing blurs, resulting in a
partially closed eye pattern.

In the presence of channel noise, the eye will tend to close in all cases. Weaker noise
will cause proportionately less closing. The decision threshold with respect to which symbol
(1 or 0) was transmitted is the midpoint of the eye.* Observe that for zero ISI, the system can
tolerate noise of up to half the vertical opening of the eye. Any noise value larger than this
amount can cause a decision error if its sign is opposite to the sign of the data symbol. Because
ISI reduces the eye opening, it clearly reduces noise tolerance. The eye diagram is also used
to determine optimum tap settings of the equalizer. Taps are adjusted to obtain the maximum
vertical and horizontal eye opening.

Figure 7.26

The eye diagram is a very effective tool for signal analysis during real-time experiments.
Not only is it simple to generate, it also provides very rich and important information about the
quality and susceptibility of the received digital signal. From the typical eye diagram given in
Fig. 7.26, we can extract several key measures regarding the signal quality.

• Maximum opening point. The eye opening amount at the sampling and decision instant
indicates the amount of noise the detector can tolerate without making an error. The quantity
is known as the noise margin. The instant of maximum eye opening indicates the optimum
sampling or decision-making instant.

• Sensitivity to timing jitter. The width of the eye indicates the time interval over which a
correct decision can still be made, and it is desirable to have an eye with the maximum horizontal
opening. If the decision-making instant deviates from the instant when the eye has a maximum
vertical opening, the margin of noise tolerance is reduced. This causes higher error probability
in pulse detection. The slope of the eye shows how fast the noise tolerance is reduced and,
hence, the sensitivity of the decision noise tolerance to variation of the sampling instant. It
demonstrates the effects of timing jitter.

• Level-crossing (timing) jitter. Typically, practical receivers extract timing information about
the pulse rate and the sampling clock from the (zero) level crossing of the received signal
waveform. The variation of level crossing can be seen from the width of the eye corners.
This measure provides information about the timing jitter such a receiver is expected to
experience.

* This is true for a two-level decision [e.g., when p(t) and -p(t) are used for 1 and 0,
respectively]. For a three-level decision (e.g., bipolar signaling), there will be two thresholds.

Figure 7.27

Finally, we provide a practical eye diagram example for a polar signaling waveform. In
this case, we select a cosine roll-off pulse that satisfies Nyquist's first criterion of zero ISI.
The roll-off factor is chosen to be r = 0.5. The eye diagram is shown in Fig. 7.27 for a time
base of $2T_b$. In fact, even for the same signal, the eye diagrams may be somewhat different
for different time offset (or initial point) values. Figure 7.27a illustrates the eye diagram of
this polar signaling waveform for a display time offset of $T_b/2$, whereas Fig. 7.27b shows the
normal eye diagram when the display time offset value is zero. It is clear from comparison
that these two diagrams have a simple horizontal circular shift relationship. By observing the
maximum eye opening, we can see that this baseband signal has zero ISI, confirming the basic
feature of the raised-cosine pulse. On the other hand, because Nyquist's first criterion places no
requirement on the zero crossing of the pulse, the eye diagram indicates that timing jitter would
be likely.
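The zero-ISI property seen at the maximum eye opening can be checked numerically. The sketch below
uses the standard closed-form raised-cosine pulse, normalized so that p(0) = 1 and with time
measured in units of T_b. This expression is an assumption about the exact pulse behind the figure,
and the symbol sequence is arbitrary.

```python
import math

def raised_cosine(t, r, T=1.0):
    """Raised-cosine pulse with roll-off factor r, normalized so that p(0) = 1."""
    x = t / T
    sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    denom = 1.0 - (2.0 * r * x) ** 2
    if abs(denom) < 1e-12:
        # Removable singularity at |t| = T / (2r): use the limiting value.
        half = 1.0 / (2.0 * r)
        return (math.pi / 4.0) * math.sin(math.pi * half) / (math.pi * half)
    return sinc * math.cos(math.pi * r * x) / denom

def received_sample(symbols, k, r):
    """y(k T_b) = sum over j of a_j * p((k - j) T_b) for a polar symbol sequence."""
    return sum(a * raised_cosine(k - j, r) for j, a in enumerate(symbols))

symbols = [1, -1, -1, 1, 1, -1, 1, -1]
samples = [received_sample(symbols, k, r=0.5) for k in range(len(symbols))]
```

At the sampling instants every neighboring pulse contributes essentially nothing, so each sample
reproduces its own symbol. Between the sampling instants the neighboring pulses do contribute,
which is why the level crossings of the traces spread out.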


7.7 PAM: M-ARY BASEBAND SIGNALING FOR 
HIGHER DATA RATE 


Regardless of which line code is used, binary baseband modulations have one thing in common:
they all transmit one bit of information over the interval of $T_b$ seconds, or at the bit rate of
$1/T_b$ bits per second. If the transmitter would like to send bits at a much higher rate, $T_b$
may be shortened. For example, to increase the bit rate by a factor of M, $T_b$ must be reduced by
the same factor of M; however, there is a heavy price to be paid in bandwidth. As we demonstrated
in Fig. 7.9, the bandwidth of baseband modulation is proportional to the pulse rate $1/T_b$.
Shortening $T_b$ by a factor of M will certainly increase the required channel bandwidth by M.
Fortunately, reducing $T_b$ is not the only way to increase data rate. A very effective practical
solution is to allow each pulse to carry multiple bits. We explain this concept here.

For each symbol transmission within the time interval of $T_b$ to carry more bits, there
must be more than two symbols to choose from. By increasing the number of symbols to M,
we ensure that the information transmitted by each symbol will also increase with M. For
example, when M = 4 (4-ary, or quaternary), we have four basic symbols, or pulses, available
for communication (Fig. 7.28a). A sequence of two binary digits can be transmitted by just one
4-ary symbol. This is because a sequence of two bits can form only four possible sequences
(viz., 11, 10, 01, and 00). Because we have four distinct symbols available, we can assign one
of the four symbols to each of these combinations (Fig. 7.28a). Each symbol now occupies a
time duration of $T_b$. A signaling example for a short sequence is given in Fig. 7.28b and the
4-ary eye diagram is shown in Fig. 7.28c.

Figure 7.28

This signaling allows us to transmit each pair of bits by one 4-ary pulse (Fig. 7.28b).
Hence, to transmit n bits, we need only (n/2) 4-ary pulses. This means one 4-ary symbol can
transmit the information of two binary digits. Also, because three bits can form 2 × 2 × 2 = 8
combinations, a group of three bits can be transmitted by one 8-ary symbol. Similarly, a group
of four bits can be transmitted by one 16-ary symbol. In general, the information $I_M$ transmitted
by an M-ary symbol is

$$
I_M = \log_2 M \ \text{bits} \tag{7.55}
$$

This means we can increase the rate of information transmission by increasing M.
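The grouping of bits into M-ary symbols can be sketched as follows. The natural binary mapping
used here (block value v mapped to the amplitude 2v - (M - 1)) is an assumption. For M = 4 it
reproduces the assignment of Eq. (7.56) in Example 7.4, but other assignments are equally valid.
The function names are illustrative.

```python
import math

def bits_per_symbol(M):
    """I_M = log2(M) bits carried by one M-ary symbol, as in Eq. (7.55)."""
    return math.log2(M)

def pam_encode(bits, M):
    """Group bits into blocks of log2(M) and map each block to an amplitude.

    Assumed mapping: a block with binary value v becomes 2v - (M - 1), which
    gives the equally spaced levels -(M-1), ..., -1, +1, ..., +(M-1).
    """
    k = int(math.log2(M))
    if 2 ** k != M or len(bits) % k:
        raise ValueError("M must be a power of 2 and len(bits) a multiple of log2(M)")
    symbols = []
    for i in range(0, len(bits), k):
        value = int("".join(str(b) for b in bits[i:i + k]), 2)
        symbols.append(2 * value - (M - 1))
    return symbols

symbols = pam_encode([1, 1, 0, 0, 0, 1, 1, 0], M=4)   # pairs 11, 00, 01, 10
```

Eight bits become four 4-ary symbols, so the pulse rate is halved for the same bit rate, or the bit
rate is doubled for the same pulse rate.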

This special M-ary signaling is known as pulse amplitude modulation (PAM)
because the data information is conveyed by the varying pulse amplitude. We should note
here that pulse amplitude modulation is only one of many possible choices of M-ary signaling.
There are an infinite number of such choices. Still, only a limited few are truly effective in
combating noise and efficient in saving bandwidth and power. A more detailed discussion of
other M-ary signaling schemes will be presented a little later, in Sec. 7.9.

As in most system designs, there are always prices to pay for every possible gain. The
price paid by PAM to increase data rate is power. As M increases, the transmitted power also
increases. This is because to have the same noise immunity, the minimum separation
between pulse amplitudes should be comparable to that of binary pulses. Therefore, pulse
amplitudes increase with M (Fig. 7.28). It can be shown that the transmitted power increases
as $M^2$ (Prob. 7.7-5). Thus, to increase the rate of communication by a factor of $\log_2 M$, the
power required increases as $M^2$. Because the transmission bandwidth depends only on the
pulse rate and not on pulse amplitudes, the bandwidth is independent of M. We will use the
following example of PSD analysis to illustrate this point.
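Before turning to the PSD example, the growth of power with M can be checked with a few lines of
Python. The sketch assumes equally likely amplitudes ±1, ±3, ..., ±(M - 1), that is, the same
minimum separation as binary polar pulses of amplitude ±1, and it measures power as the mean
squared amplitude relative to the binary case. The function names are illustrative.

```python
def pam_levels(M):
    """Equally spaced amplitudes -(M-1), ..., -1, +1, ..., +(M-1)."""
    return [2 * i - (M - 1) for i in range(M)]

def average_power(M):
    """Mean squared amplitude when all M levels are equally likely."""
    levels = pam_levels(M)
    return sum(a * a for a in levels) / M

power_table = {M: average_power(M) for M in (2, 4, 8, 16)}
```

Under this assumption the mean squared amplitude equals (M² - 1)/3: it is 1 for binary signaling
and 5 for M = 4, and it roughly quadruples each time M doubles, which is the $M^2$ growth stated
above.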


Example 7.4 Determine the PSD of the quaternary (4-ary) baseband signaling in Fig. 7.28 when the
message bits 1 and 0 are equally likely.

The 4-ary line code has four distinct symbols corresponding to the four different
combinations of two message bits. One such mapping is

$$
a_k = \begin{cases}
-3 & \text{message bits } 00 \\
-1 & \text{message bits } 01 \\
+1 & \text{message bits } 10 \\
+3 & \text{message bits } 11
\end{cases}
\tag{7.56}
$$

Therefore, all four values of $a_k$ are equally likely, each with a chance of 1 in 4. Recall that

$$
R_0 = \lim_{N \to \infty} \frac{1}{N} \sum_k a_k^2
$$

Within the summation, 1/4 of the $a_k$ will take each of the values ±1 and ±3. Thus,

$$
R_0 = \lim_{N \to \infty} \frac{1}{N} \left[ \frac{N}{4}(-3)^2 + \frac{N}{4}(-1)^2 + \frac{N}{4}(1)^2 + \frac{N}{4}(3)^2 \right] = 5
$$

On the other hand, for n > 0, we need to determine

$$
R_n = \lim_{N \to \infty} \frac{1}{N} \sum_k a_k a_{k+n}
$$

To find this average value, we build a table with all the possible values of the product
$a_k a_{k+n}$:

Possible Values of $a_k a_{k+n}$

$$
\begin{array}{c|rrrr}
a_k \backslash a_{k+n} & -3 & -1 & +1 & +3 \\ \hline
-3 & 9 & 3 & -3 & -9 \\
-1 & 3 & 1 & -1 & -3 \\
+1 & -3 & -1 & 1 & 3 \\
+3 & -9 & -3 & 3 & 9
\end{array}
$$

From the foregoing table listing all the possible products of $a_k a_{k+n}$, we see that each
product in the summation $a_k a_{k+n}$ can take on any of the following six values: ±1, ±3, ±9.
First, (±1, ±9) are equally likely (1 in 8). On the other hand, ±3 are equally likely (1 in
4). Thus, we can show that

$$
R_n = \lim_{N \to \infty} \frac{1}{N} \left[ \frac{N}{8}(-9) + \frac{N}{8}(+9) + \frac{N}{8}(-1) + \frac{N}{8}(+1) + \frac{N}{4}(-3) + \frac{N}{4}(+3) \right] = 0
$$

As a result,

$$
S_y(f) = \frac{5}{T_b} |P(f)|^2
$$

Thus, the M-ary line code generates the same PSD shape as binary polar signaling.
The only difference is that it utilizes 5 times the original signal power.
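The two averages in this example can be verified by direct enumeration. The sketch below assumes
that successive symbols are independent and equally likely, so every ordered pair of levels has
probability 1/16. It uses exact fractions to avoid rounding, and the variable names are
illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

levels = [-3, -1, 1, 3]

# R_0: average of a_k squared over equally likely levels.
R0 = Fraction(sum(a * a for a in levels), len(levels))

# R_n for n > 0: average of the product over all 16 equally likely pairs.
products = Counter(a * b for a, b in product(levels, repeat=2))
total = sum(products.values())
probabilities = {value: Fraction(count, total) for value, count in products.items()}
Rn = sum(value * p for value, p in probabilities.items())
```

The enumeration gives probability 1/8 for each of ±1 and ±9 and 1/4 for each of ±3, so the products
cancel in pairs and only $R_0 = 5$ survives.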


Although most terrestrial digital telephone networks use binary encoding, the subscriber
loop portion of the integrated services digital network (ISDN) uses the quaternary code 2B1Q,
similar to Fig. 7.28a. It uses NRZ pulses to transmit 160 kbit/s of data at a baud rate (pulse
rate) of 80 kbaud. Of the various line codes examined by the ANSI standards committee, 2B1Q
provided the greatest baud rate reduction in the noisy and cross-talk-prone local cable plant
environment.

Pulse Shaping and Eye Diagrams in PAM: In this case, we can use the Nyquist criterion
pulses because these pulses have zero ISI at the sample points, and, therefore, their amplitudes
can be correctly detected by sampling at the pulse centers. We can also use the controlled ISI
(partial-response signaling) for M-ary signaling.⁸

Eye diagrams can also be generated for M-ary PAM by using the same method used for
binary modulations. Because of multilevel signaling, the eye diagram should have M levels
at the optimum sampling instants even when ISI is zero. Here we generate the practical eye
diagram example for a four-level PAM signal that uses the same cosine roll-off pulse with
roll-off factor r = 0.5 that was used in the eye diagram of Fig. 7.27. The corresponding eye
diagrams with time offsets of $T_b/2$ and 0 are given in Fig. 7.29a and b, respectively. Once
again, no ISI is observed at the sampling instants. The eye diagrams clearly show four equally
separated signal values without ISI at the optimum sampling points.

Figure 7.29


7.8 DIGITAL CARRIER SYSTEMS 

Thus far, we have discussed baseband digital systems, where signals are transmitted directly 
without any shift in frequency. Because baseband signals have sizable power at low frequencies, 
they are suitable for transmission over a pair of wires and coaxial cables. Much of the modern
communication is conducted this way. However, baseband signals cannot be transmitted over a
radio link or satellites because this would necessitate impractically large antennas to efficiently 
radiate the low-frequency spectrum of the signal. Hence, for these applications, the signal 
spectrum must be shifted to a high-frequency range. A spectrum shift to higher frequencies 
is also required to transmit several messages simultaneously by sharing the large bandwidth 
of the transmission medium. As seen in Chapter 4, the spectrum of a signal can be shifted 
to a higher frequency by applying the baseband digital signal to modulate a high-frequency 
sinusoid (carrier). 

To transmit and receive digital carrier signals, we need a modulator and a demodulator.
The two devices are usually packaged in one unit called a modem for two-way (duplex)
communications.

7.8.1 Basic Binary Carrier Modulations 

There are two basic forms of carrier modulation: amplitude modulation and angle modulation.
In amplitude modulation, the carrier amplitude is varied in proportion to the modulating signal
(i.e., the baseband signal). This is shown in Fig. 7.30. An unmodulated carrier $\cos \omega_c t$
is shown in Fig. 7.30a. The on-off baseband signal m(t) (the modulating signal) is shown in
Fig. 7.30b. It can be written according to Eq. (7.1) as

$$
m(t) = \sum_k a_k \, p(t - kT_b)
$$

where p(t) is the full-width rectangular pulse $\Pi(\cdot)$ of duration $T_b$. The line code
$a_k = 0, 1$ is on-off. When the carrier amplitude is varied in proportion to m(t),
we can write the carrier modulated signal as

$$
\varphi_{ASK}(t) = m(t) \cos \omega_c t \tag{7.57}
$$

shown in Fig. 7.30c. Note that the modulated signal is still an on-off signal. This modulation
scheme of transmitting binary data is known as on-off keying (OOK) or amplitude shift
keying (ASK).

Figure 7.30

Of course, the baseband signal m(t) may utilize a pulse p(t) different from the rectangular
one shown in the example of Fig. 7.30. This will generate an ASK signal that does not have a
constant amplitude during the transmission of 1 ($a_k = 1$).

If the baseband signal m(t) were polar (Fig. 7.31a), the corresponding modulated signal
$m(t) \cos \omega_c t$ would appear as shown in Fig. 7.31b. In this case, if p(t) is the basic
pulse, we are transmitting 1 by a pulse $p(t) \cos \omega_c t$ and 0 by
$-p(t) \cos \omega_c t = p(t) \cos(\omega_c t + \pi)$. Hence, the two pulses are π radians apart in
phase. The information resides in the phase or the sign of the pulse. For this reason this scheme
is known as phase shift keying (PSK). Note that the transmission is still polar. In fact, just like
ASK, the PSK modulated carrier signal has the same form

$$
\varphi_{PSK}(t) = m(t) \cos \omega_c t \qquad m(t) = \sum_k a_k \, p(t - kT_b) \tag{7.58}
$$

with the difference that the line code is polar, $a_k = \pm 1$.

When data are transmitted by varying the frequency, we have the case of frequency shift
keying (FSK), as shown in Fig. 7.31c. A 0 is transmitted by a pulse of frequency $\omega_{c_0}$,
and 1 is transmitted by a pulse of frequency $\omega_{c_1}$. The information about the transmitted
data resides in the carrier frequency. The FSK signal may be viewed as a sum of two interleaved ASK
signals, one with a carrier frequency $\omega_{c_0}$ and the other with a carrier frequency
$\omega_{c_1}$. We can use the binary ASK expression of Eq. (7.57) to write the FSK signal as

$$
\varphi_{FSK}(t) = \sum_k a_k \, p(t - kT_b) \cos \omega_{c_1} t + \sum_k (1 - a_k) \, p(t - kT_b) \cos \omega_{c_0} t
$$

where $a_k = 0, 1$ is on-off. Thus the FSK signal is a superposition of two AM signals with
different carrier frequencies and different but complementary amplitudes.

In practice, ASK as an on-off scheme is commonly used today in optical fiber communications
in the form of laser-intensity modulation. PSK is commonly applied in digital satellite
communications and was also used in earlier telephone modems (2400 and 4800 bit/s). As for
FSK, AT&T in 1962 developed one of the earliest telephone-line modems called 103A; it uses
FSK to transmit 300 bit/s at two frequencies, 1070 and 1270 Hz, and receives FSK at 2025
and 2225 Hz.
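The three binary carrier modulations can be generated sample by sample, as in the sketch below. It
assumes full-width rectangular pulses, a sampling rate of 9600 Hz with 32 samples per bit
(300 bit/s), and a 1200 Hz carrier for ASK and PSK. These values, and the assignment of 1070 Hz to
0 and 1270 Hz to 1, are illustrative assumptions rather than details taken from the text.

```python
import math

def carrier(freq, n, fs):
    """Sample n of cos(2 pi freq t) at sampling rate fs."""
    return math.cos(2.0 * math.pi * freq * n / fs)

def ask(bits, fc, fs, spb):
    """On-off keying: m(t) cos(w_c t) with a_k in {0, 1}."""
    return [bits[n // spb] * carrier(fc, n, fs) for n in range(len(bits) * spb)]

def psk(bits, fc, fs, spb):
    """Binary PSK: m(t) cos(w_c t) with a_k in {+1, -1}."""
    return [(1 if bits[n // spb] else -1) * carrier(fc, n, fs)
            for n in range(len(bits) * spb)]

def fsk(bits, f0, f1, fs, spb):
    """Binary FSK: a 0 selects the tone at f0 and a 1 selects the tone at f1."""
    return [carrier(f1 if bits[n // spb] else f0, n, fs)
            for n in range(len(bits) * spb)]

bits = [1, 0, 1, 1, 0]
fs, spb = 9600, 32
fc, f0, f1 = 1200, 1070, 1270

x_ask = ask(bits, fc, fs, spb)
x_psk = psk(bits, fc, fs, spb)
x_fsk = fsk(bits, f0, f1, fs, spb)

# FSK as the sum of two complementary ASK signals, as described above.
complement = [1 - b for b in bits]
two_ask = [u + v for u, v in zip(ask(bits, f1, fs, spb), ask(complement, f0, fs, spb))]
```

The list two_ask matches x_fsk sample for sample, which is the superposition of two complementary
AM signals described above. Note that this simple generator does not enforce phase continuity at
the bit boundaries.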


7.8.2 PSD of Digital Carrier Modulation 

We have just shown that the binary carrier modulations of ASK, PSK, and FSK can all be
written in some form of $m(t) \cos \omega_c t$. To determine the PSD of the ASK, PSK, and FSK
signals, it would be helpful for us to first find the relationship between the PSD of m(t) and
the PSD of the modulated signal

$$
\varphi(t) = m(t) \cos \omega_c t
$$

Recall from Eq. (3.80) that the PSD of $\varphi(t)$ is

$$
S_\varphi(f) = \lim_{T \to \infty} \frac{|\Phi_T(f)|^2}{T}
$$

where $\Phi_T(f)$ is the Fourier transform of the truncated signal

$$
\begin{aligned}
\varphi_T(t) &= \varphi(t) \left[ u(t + T/2) - u(t - T/2) \right] \\
&= m(t) \left[ u(t + T/2) - u(t - T/2) \right] \cos \omega_c t \\
&= m_T(t) \cos \omega_c t
\end{aligned}
\tag{7.59}
$$

Here $m_T(t)$ is the truncated baseband signal with Fourier transform $M_T(f)$. Applying the
frequency shift property [see Eq. (3.36)], we have

$$
\Phi_T(f) = \frac{1}{2} \left[ M_T(f - f_c) + M_T(f + f_c) \right]
$$

As a result, the PSD of the modulated carrier signal $\varphi(t)$ is

$$
S_\varphi(f) = \lim_{T \to \infty} \frac{1}{4} \, \frac{|M_T(f + f_c) + M_T(f - f_c)|^2}{T}
$$

Because M(f) is a baseband signal, $M_T(f + f_c)$ and $M_T(f - f_c)$ have zero overlap as
T → ∞ as long as $f_c$ is larger than the bandwidth of M(f). Therefore, we conclude that

$$
S_\varphi(f) = \lim_{T \to \infty} \frac{1}{4} \left[ \frac{|M_T(f + f_c)|^2}{T} + \frac{|M_T(f - f_c)|^2}{T} \right]
= \frac{1}{4} S_M(f + f_c) + \frac{1}{4} S_M(f - f_c)
\tag{7.60}
$$

In other words, for an appropriately chosen carrier frequency, modulation causes a shift in the
baseband signal PSD.
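This result can be illustrated with a discrete-time analogue. The sketch below takes a short
bandlimited test signal, multiplies it by a carrier that falls exactly on a DFT bin, and compares
the periodogram of the product with the shifted and scaled periodogram of the baseband signal. The
record length, the carrier bin, the test signal, and the use of |X[k]|²/n as a PSD estimate are all
illustrative assumptions.

```python
import cmath
import math

def dft(x):
    """Direct O(n^2) discrete Fourier transform, adequate for a short record."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

n, kc = 64, 16   # record length and carrier bin (illustrative values)

# Baseband test signal occupying only bins 2 and 3, well below the carrier bin.
m = [math.cos(2 * math.pi * 2 * t / n) + 0.5 * math.sin(2 * math.pi * 3 * t / n)
     for t in range(n)]
phi = [m[t] * math.cos(2 * math.pi * kc * t / n) for t in range(n)]

# Periodogram-style PSD estimates.
S_m = [abs(v) ** 2 / n for v in dft(m)]
S_phi = [abs(v) ** 2 / n for v in dft(phi)]

# Right-hand side of the shift relation: one quarter of each shifted copy.
S_shifted = [0.25 * (S_m[(k - kc) % n] + S_m[(k + kc) % n]) for k in range(n)]
max_error = max(abs(a - b) for a, b in zip(S_phi, S_shifted))
```

Because the two shifted copies do not overlap, the cross term vanishes and the two sides agree to
within rounding error. If the test signal were wide enough for the copies to overlap, the simple
sum of shifted spectra would no longer hold.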

Now, the ASK signal in Fig. 7.30c fits this model, with m(t) being an on-off signal (using
a full-width or NRZ pulse). Hence, the PSD of the ASK signal is the same as that of an on-off
signal (Fig. 7.4b) shifted to $\pm f_c$, as shown in Fig. 7.32a. Remember that by using a
full-width rectangular pulse p(t),

$$
P\left( \frac{n}{T_b} \right) = 0 \qquad n = \pm 1, \pm 2, \ldots
$$

In this case, the baseband on-off PSD has no discrete components except at dc in Fig. 7.30b.
Therefore, the ASK spectrum has a discrete component only at the carrier frequency $\omega_c$.

The PSK signal also fits this modulation description where m(t) is a polar signal using
a full-width NRZ pulse. Therefore, the PSD of a PSK signal is the same as that of the polar
baseband signal shifted to $\pm f_c$, as shown in Fig. 7.32b. Note that this PSD has the same shape
(with a different scaling factor) as the PSD of the ASK minus its discrete components.

Finally, we have shown that the FSK signal may be viewed as a sum of two interleaved
ASK signals using the full-width pulse. Hence, the spectrum of FSK is the sum of two ASK
spectra at frequencies $\omega_{c_0}$ and $\omega_{c_1}$, as shown in Fig. 7.32c. It can be shown
that by properly choosing $\omega_{c_0}$ and $\omega_{c_1}$ and by maintaining phase continuity
during frequency switching, discrete components can be eliminated at $\omega_{c_0}$ and
$\omega_{c_1}$. Thus, no discrete components appear in this spectrum. It is important to note that
the bandwidth of FSK is higher than that of ASK or PSK.

As observed earlier, polar signaling is the most power-efficient scheme. The PSK, being
polar, requires 3 dB less power than ASK (or FSK) for the same noise immunity, that is, for
the same error probability in pulse detection.

Figure 7.32 PSD of (a) ASK, (b) PSK, and (c) FSK.

Of course, we can also modulate bipolar, or any other scheme discussed earlier. Also, note
that the use of the NRZ rectangular pulse in Fig. 7.30 or 7.31 is for the sake of illustration only.
In practice, baseband pulses may be spectrally shaped to eliminate ISI.


7.8.3 Connections between Analog and Digital Carrier Modulations

There is a natural and clear connection between ASK and AM because the message information
is directly reflected in the varying amplitude of the modulated signals. Because of its
nonnegative amplitude, ASK is essentially an AM signal with modulation index $\mu = 1$. There
is a similar connection between FSK and FM. FSK is simply an FM signal with only a limited
number of instantaneous frequencies.

The connection between PSK and analog modulation is a bit more subtle. For PSK, the
modulated signal can be written as

$$\varphi_{\text{PSK}}(t) = A \cos(\omega_c t + \theta_k) \qquad kT_b \le t < kT_b + T_b$$

It can therefore be connected with PM. However, a closer look at the PSK signal reveals that
because of the constant phase $\theta_k$, its instantaneous frequency, in fact, does not change. In fact,
we can rewrite the PSK signal

$$\begin{aligned}
\varphi_{\text{PSK}}(t) &= A \cos\theta_k \cos\omega_c t - A \sin\theta_k \sin\omega_c t \\
&= a_k \cos\omega_c t + b_k \sin\omega_c t \qquad kT_b \le t < kT_b + T_b
\end{aligned} \tag{7.61}$$

by letting $a_k = A \cos\theta_k$ and $b_k = -A \sin\theta_k$. From Eq. (7.61), we recognize its strong
resemblance to the QAM signal representation in Sec. 4.4. Therefore, the digital PSK modulation is
closely connected with the analog QAM signal. In particular, $\theta_k = 0, \pi$ for binary PSK. Thus,
binary PSK can be written as

$$\pm A \cos\omega_c t$$

This is effectively a digital manifestation of the DSB-SC amplitude modulation. In fact, as will
be discussed later, by letting $a_k$ take on multilevel values while setting $b_k = 0$ we can generate
another digital carrier modulation known as the pulse amplitude modulation (or PAM), which
can carry multiple bits during each modulation time interval $T_b$.

As we have studied in Chapter 4, DSB-SC amplitude modulation is more power efficient 
than AM. Binary PSK is therefore more power efficient than ASK. In terms of bandwidth 
utilization, we can see from their connection to analog modulations that ASK and PSK have 
identical bandwidth occupation while FSK requires larger bandwidth. These observations 
intuitively corroborate our PSD results of Fig. 7.32. 
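
The equivalence between the phase form of PSK and the quadrature form of Eq. (7.61) is easy to check numerically. The following is a minimal sketch in Python (standard library only); the function names are illustrative choices, not notation from the text, and the carrier frequency used to exercise them is arbitrary.

```python
import math

def psk_quadrature_components(A, theta):
    """Return (a_k, b_k) with a_k = A cos(theta), b_k = -A sin(theta), as in Eq. (7.61)."""
    return A * math.cos(theta), -A * math.sin(theta)

def psk_direct(A, theta, wc, t):
    """Phase form of the PSK pulse: A cos(wc*t + theta)."""
    return A * math.cos(wc * t + theta)

def psk_from_components(A, theta, wc, t):
    """Quadrature (QAM-like) form: a_k cos(wc*t) + b_k sin(wc*t)."""
    a_k, b_k = psk_quadrature_components(A, theta)
    return a_k * math.cos(wc * t) + b_k * math.sin(wc * t)

# Binary PSK uses theta = 0 or pi, so b_k vanishes and a_k = +A or -A.
for theta in (0.0, math.pi):
    a_k, b_k = psk_quadrature_components(1.0, theta)
    print(f"theta = {theta:.4f}: a_k = {a_k:+.3f}, b_k = {b_k:+.3f}")
```

With $\theta_k$ restricted to 0 and $\pi$ the quadrature component is zero to within rounding, which is the DSB-SC interpretation of binary PSK described above.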


7.8.4 Demodulation 

Demodulation of digital-modulated signals is similar to that of analog-modulated signals. 
Because of the connections between ASK and AM, between FSK and FM, and between PSK and 
QAM (or DSB-SC AM), different demodulation techniques used for the analog modulations 
can be directly applied to their digital counterparts. 


ASK Detection 

Just like AM, ASK (Fig. 7.30c) can be demodulated either coherently (synchronous detection)
or noncoherently (envelope detection). The coherent detector requires more elaborate
equipment and has superior performance, especially when the signal power (hence SNR) is
low. For higher SNR, the envelope detector performs almost as well as the coherent detector.
Hence, coherent detection is not often used for ASK because it will defeat its very purpose
(the simplicity of detection). If we can avail ourselves of a synchronous detector, we might as
well use PSK, which has better power efficiency than ASK.

FSK Detection 

Once again, the binary FSK can be viewed as two interleaved ASK signals with carrier
frequencies $\omega_{c_0}$ and $\omega_{c_1}$, respectively (Fig. 7.32c). Therefore, FSK can be detected coherently or
noncoherently. In noncoherent detection, the incoming signal is applied to a pair of filters tuned
to $\omega_{c_0}$ and $\omega_{c_1}$, respectively. Each filter is followed by an envelope detector (see Fig. 7.33a).
The outputs of the two envelope detectors are sampled and compared. If a 0 is transmitted by
a pulse of frequency $\omega_{c_0}$, then this pulse will appear at the output of the filter tuned to $\omega_{c_0}$.
Practically no signal appears at the output of the filter tuned to $\omega_{c_1}$. Hence, the sample of the
envelope detector output following the $\omega_{c_0}$ filter will be greater than the sample of the envelope
detector output following the $\omega_{c_1}$ filter, and the receiver decides that a 0 was transmitted. In
the case of a 1, the opposite happens.

Figure 7.33 (a) Noncoherent detection of FSK. (b) Coherent detection of FSK.

Figure 7.34 Coherent binary PSK detector (similar to a DSB-SC demodulator).

Of course, FSK can also be detected coherently by generating two references of frequencies
$\omega_{c_0}$ and $\omega_{c_1}$, for the two demodulators, to demodulate the signal received and then comparing
the outputs of the two demodulators as shown in Fig. 7.33b. Thus, a coherent FSK detector
must generate two carriers in synchronization with the modulation carriers. Once again, this
complex demodulator defeats the purpose of FSK, which is designed primarily for simpler,
noncoherent detection. In practice, coherent FSK detection is not in use.
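
The noncoherent decision rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: instead of the bandpass filter and envelope detector of Fig. 7.33a, each branch below estimates the envelope by correlating one bit interval with a cosine and a sine at the branch frequency (a swapped-in quadrature technique that is likewise insensitive to the carrier phase). The function names, sampling rate, and tone frequencies are hypothetical.

```python
import math

def envelope_at(samples, freq, fs):
    """Estimate the amplitude of the component of `samples` at `freq` (Hz).

    Correlates with cos and sin at `freq` and combines the two results,
    so the estimate does not depend on the unknown carrier phase.
    """
    i_sum = 0.0
    q_sum = 0.0
    for n, x in enumerate(samples):
        arg = 2 * math.pi * freq * n / fs
        i_sum += x * math.cos(arg)
        q_sum += x * math.sin(arg)
    return 2 * math.hypot(i_sum, q_sum) / len(samples)

def detect_fsk_bit(samples, f0, f1, fs):
    """Decide 0 if the f0 branch has the larger envelope, otherwise 1."""
    return 0 if envelope_at(samples, f0, fs) > envelope_at(samples, f1, fs) else 1

def fsk_pulse(bit, f0, f1, fs, n_samples, phase=0.0):
    """One bit interval of binary FSK with an arbitrary carrier phase."""
    f = f1 if bit else f0
    return [math.cos(2 * math.pi * f * n / fs + phase) for n in range(n_samples)]

# Illustrative numbers: 0.1 s bit interval sampled at 8 kHz, tones at 1000 and 1200 Hz.
fs, n_samples, f0, f1 = 8000, 800, 1000.0, 1200.0
received = fsk_pulse(1, f0, f1, fs, n_samples, phase=1.0)
print(detect_fsk_bit(received, f0, f1, fs))
```

The receiver never needs to know `phase`, which is the point of noncoherent detection.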


PSK Detection 

In binary PSK, a 1 is transmitted by a pulse $A \cos\omega_c t$ and a 0 is transmitted by a pulse
$-A \cos\omega_c t$ (Fig. 7.31b). The information in PSK signals therefore resides in the carrier phase.
Just as in DSB-SC, these signals cannot be demodulated via envelope detection because the
envelope stays constant for both 1 and 0 (Fig. 7.31b). The coherent detector of the binary PSK
modulation is shown in Fig. 7.34. The coherent detection is similar to that used for analog
signals. Methods of carrier acquisition have been discussed in Sec. 4.8.
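
A coherent binary PSK decision can be sketched as follows, assuming a perfectly synchronized local carrier and replacing the low-pass filter by an average over one bit interval. The function names and the numerical values are illustrative assumptions only.

```python
import math

def bpsk_pulse(bit, fc, fs, n_samples, A=1.0):
    """One bit interval: +A cos(wc t) for a 1, -A cos(wc t) for a 0."""
    sign = 1.0 if bit else -1.0
    return [sign * A * math.cos(2 * math.pi * fc * n / fs) for n in range(n_samples)]

def coherent_bpsk_detect(samples, fc, fs):
    """Multiply by the local carrier, average over the bit, and threshold at zero."""
    z = sum(x * math.cos(2 * math.pi * fc * n / fs) for n, x in enumerate(samples))
    z /= len(samples)  # approximately +A/2 for a 1 and -A/2 for a 0
    return 1 if z > 0 else 0

fs, fc, n_samples = 8000, 1000.0, 800
print([coherent_bpsk_detect(bpsk_pulse(b, fc, fs, n_samples), fc, fs) for b in (1, 0, 1)])
```

Both pulses have the same envelope, so the sign of the averaged product is the only thing that separates them.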


Differential PSK 

Although envelope detection cannot be used for PSK detection, it is still possible to exploit the
finite number of modulation phase values for noncoherent detection. Indeed, PSK signals may
be demodulated noncoherently by means of an ingenious method known as differential PSK,
or DPSK. The principle of differential detection is for the receiver to detect the relative phase
change between successive modulated phases $\theta_k$ and $\theta_{k-1}$. Since the phase takes only a
finite number of values (0 and $\pi$ in binary PSK), the transmitter can encode the information data into
the phase difference $\theta_k - \theta_{k-1}$. For example, a phase difference of zero represents 0 whereas
a phase difference of $\pi$ signifies 1.

Figure 7.35 (a) Differential encoding; (b) encoded signal; (c) differential PSK receiver.

This technique is known as differential encoding (before modulation). In one differential
code, a 0 is encoded by the same pulse used to encode the previous data bit (no transition), and
a 1 is encoded by the negative of the pulse used to encode the previous data bit (transition).
Differential encoding is simple to implement, as shown in Fig. 7.35a. Notice that the addition
is modulo-2. The encoded signal is shown in Fig. 7.35b. Thus a transition in the line code pulse
sequence indicates 1 and no transition indicates 0. The modulated signal consists of pulses

$$A \cos(\omega_c t + \theta_k) = \pm A \cos\omega_c t$$

If the data bit is 0, the present pulse and the previous pulse have the same polarity or phase; both
pulses are either $A \cos\omega_c t$ or $-A \cos\omega_c t$. If the data bit is 1, the present pulse and the previous
pulse are of opposite polarities or phases; if the present pulse is $A \cos\omega_c t$, the previous pulse
is $-A \cos\omega_c t$, and vice versa.

In demodulation of DPSK (Fig. 7.35c), we avoid generation of a local carrier by observing
that the received modulated signal itself is a carrier ($\pm A \cos\omega_c t$) with a possible sign
ambiguity. For demodulation, in place of the carrier, we use the received signal delayed
by $T_b$ (one bit interval). If the received pulse is identical to the previous pulse, the product
$y(t) = A^2 \cos^2\omega_c t = (A^2/2)(1 + \cos 2\omega_c t)$, and the low-pass filter output $z(t) = A^2/2$.
We immediately detect the present bit as 0. If the received pulse and the previous pulse are of
opposite polarity, $y(t) = -A^2 \cos^2\omega_c t$ and $z(t) = -A^2/2$, and the present bit is detected as
1. Table 7.3 illustrates a specific example of the encoding and decoding.

TABLE 7.3
Differential Encoding and Detection of Binary DPSK

Time k                      0    1    2    3    4    5    6    7    8    9   10
I_k                              1    0    1    0    0    1    1    1    0    0
q_k                         0    1    1    0    0    0    1    0    1    1    1
Line code a_k              -1    1    1   -1   -1   -1    1   -1    1    1    1
θ_k                         π    0    0    π    π    π    0    π    0    0    0
θ_k − θ_{k−1}                    π    0    π    0    0    π    π    π    0    0
Detected bits                    1    0    1    0    0    1    1    1    0    0
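
The encoding and detection steps of Table 7.3 can be reproduced with a short Python sketch. It works on the baseband polarities $\pm 1$ rather than on carrier pulses, so the product of successive polarities stands in for the low-pass filter output $z(t)$; the function names are illustrative, and the initial reference bit $q_0 = 0$ is taken from the first column of the table.

```python
def differential_encode(data_bits, q0=0):
    """q_k = I_k XOR q_{k-1} (modulo-2 addition), starting from the reference bit q0."""
    q = [q0]
    for bit in data_bits:
        q.append(q[-1] ^ bit)
    return q

def line_code(q):
    """Map encoded bits to polar amplitudes: 0 -> -1 (phase pi), 1 -> +1 (phase 0)."""
    return [2 * b - 1 for b in q]

def differential_detect(a):
    """Compare each pulse with the previous one: same sign -> 0, opposite sign -> 1."""
    return [0 if a[k] * a[k - 1] > 0 else 1 for k in range(1, len(a))]

data = [1, 0, 1, 0, 0, 1, 1, 1, 0, 0]   # the I_k row of Table 7.3
q = differential_encode(data)
a = line_code(q)
detected = differential_detect(a)
print(q)
print(a)
print(detected)
```

Negating every element of `a` (a sign ambiguity in the received carrier) leaves `detected` unchanged, because only the product of neighboring pulses is used.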

Thus, in terms of demodulation complexity, ASK, FSK, and DPSK can all be noncoherently
detected without a synchronous carrier at the receiver. On the other hand, PSK
must be coherently detected. Noncoherent detection, however, comes with a price in terms
of noise immunity. From the point of view of noise immunity, coherent PSK is superior to all
other schemes. PSK also requires smaller bandwidth than FSK (see Fig. 7.32). Quantitative
discussion of this topic can be found in Chapter 10.


7.9 M-ARY DIGITAL CARRIER MODULATION 

The binary digital carrier modulations of ASK, FSK, and PSK all transmit one bit of information
over the interval of $T_b$ seconds, or at the bit rate of $1/T_b$ bit/s. Similar to digital baseband
transmission, higher bit rate transmission can be achieved either by reducing $T_b$ or by applying
M-ary signaling; the first option requires more bandwidth; the second requires more power.
In most communication systems, bandwidth is strictly limited. Thus, to conserve bandwidth,
an effective way to increase transmission data rate is to generalize binary modulation by
employing M-ary signaling. Specifically, we can apply M-level ASK, M-frequency FSK, and
M-phase PSK modulations.

M-ary ASK and Noncoherent Detection 

M-ary ASK is a very simple generalization of binary ASK. Instead of sending only

$$\varphi(t) = 0 \ \text{for 0} \qquad \text{and} \qquad \varphi(t) = A \cos\omega_c t \ \text{for 1}$$

M-ary ASK can send $\log_2 M$ bits each time by transmitting, for example,

$$\varphi(t) = 0,\ A \cos\omega_c t,\ 2A \cos\omega_c t,\ \ldots,\ (M - 1)A \cos\omega_c t$$

This is still an AM signal that uses M different amplitudes and a modulation index of $\mu = 1$.
Its bandwidth remains the same as that of the binary ASK, while its power is increased
proportionally with $M^2$. Its demodulation would again be achieved via envelope detection or
coherent detection.
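
The growth of power with $M^2$ can be made concrete with a small calculation. The sketch below assumes equally likely symbols and the amplitude set $0, A, \ldots, (M-1)A$ given above, and uses the fact that a sinusoid of amplitude $a$ has power $a^2/2$; the function name is an illustrative choice.

```python
from fractions import Fraction

def mask_average_power(M, A=1):
    """Average power of M-ary ASK with equally likely amplitudes 0, A, ..., (M-1)A."""
    mean_square_amplitude = Fraction(sum(k * k for k in range(M)), M) * A * A
    return mean_square_amplitude / 2  # a sinusoid of amplitude a carries power a^2 / 2

for M in (2, 4, 8, 16):
    ratio = mask_average_power(M) / mask_average_power(2)
    print(M, mask_average_power(M), ratio)
```

Under these assumptions the average power equals $A^2 (M-1)(2M-1)/12$, which grows as $M^2$ for large $M$ when the spacing $A$ between levels is held fixed.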

M-ary FSK and Orthogonal Signaling 

M-FSK is similarly generated by selecting one sinusoid from the set $\{A \cos 2\pi f_i t,\ i = 1, \ldots, M\}$
to transmit a particular pattern of $\log_2 M$ bits. Generally for FSK, we can
design a frequency increment $\delta f$ and let

$$f_m = f_1 + (m - 1)\,\delta f \qquad m = 1, 2, \ldots, M$$

For this FSK with equal frequency separation, the frequency deviation (in analyzing the FM
signal) is

$$\Delta f = \frac{f_M - f_1}{2} = \frac{1}{2}(M - 1)\,\delta f$$

It is therefore clear that the selection of the frequency set $\{f_i\}$ determines the performance and
the bandwidth of the FSK modulation. If $\delta f$ is chosen too large, then the M-ary FSK will use
too much bandwidth. On the other hand, if $\delta f$ is chosen too small, then over the time interval
of $T_b$ seconds, different FSK symbols will show virtually no difference and the receiver will be
unable to distinguish the different symbols reliably. Thus large $\delta f$ leads to bandwidth waste,
whereas small $\delta f$ is prone to detection error due to transmission noise and interference.

The task of M-ary FSK design is to determine a small enough $\delta f$ that each FSK symbol
$A \cos\omega_i t$ is highly distinct from all other FSK symbols. One solution to this problem
of FSK signal design actually can be found in the discussion of orthogonal signal space in
Sec. 2.6.2. If we can design FSK symbols to be orthogonal in $T_b$ by selecting a small $\delta f$
(or $\Delta f$), then the FSK signals will be truly distinct over $T_b$, and the bandwidth consumption
will be small.

To find the minimum $\delta f$ that leads to an orthogonal set of FSK signals, the orthogonality
condition according to Sec. 2.6.2 requires that

$$\int_0^{T_b} A \cos(2\pi f_m t)\, A \cos(2\pi f_n t)\, dt = 0 \qquad m \ne n \tag{7.62}$$

We can use this requirement to find the minimum $\delta f$. First of all,

$$\begin{aligned}
\int_0^{T_b} A \cos(2\pi f_m t)\, A \cos(2\pi f_n t)\, dt
&= \frac{A^2}{2} \int_0^{T_b} \left[\cos 2\pi(f_m + f_n)t + \cos 2\pi(f_m - f_n)t\right] dt \\
&= \frac{A^2}{2}\, T_b\, \frac{\sin 2\pi(f_m + f_n)T_b}{2\pi(f_m + f_n)T_b}
 + \frac{A^2}{2}\, T_b\, \frac{\sin 2\pi(f_m - f_n)T_b}{2\pi(f_m - f_n)T_b}
\end{aligned} \tag{7.63}$$

Since in practical modulations, $(f_m + f_n)T_b$ is very large (often no smaller than $10^3$), the first
term in Eq. (7.63) is effectively zero and negligible. Thus, the orthogonality condition reduces
to the requirement that for any integers $m \ne n$,

$$\frac{A^2}{2}\, \frac{\sin 2\pi(f_m - f_n)T_b}{2\pi(f_m - f_n)} = 0$$

Because $f_m = f_1 + (m - 1)\,\delta f$, for mutual orthogonality we have

$$\sin\left[2\pi(m - n)\,\delta f\, T_b\right] = 0 \qquad m \ne n$$

From this requirement, it is therefore clear that the smallest $\delta f$ to satisfy the mutual
orthogonality condition is

$$\delta f = \frac{1}{2T_b} \ \text{Hz}$$

This choice of minimum frequency separation is known as the minimum shift FSK. Since it
forms an orthogonal set of symbols, it is often known as orthogonal signaling.
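
The orthogonality condition of Eq. (7.62) can be checked by numerical integration. The following Python sketch uses a simple midpoint rule; the function name, the normalization $T_b = 1$, and the choice $f_1 = 100/T_b$ are assumptions made only for illustration.

```python
import math

def fsk_inner_product(fm, fn, Tb, A=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of A cos(2 pi fm t) * A cos(2 pi fn t) over [0, Tb]."""
    dt = Tb / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += A * math.cos(2 * math.pi * fm * t) * A * math.cos(2 * math.pi * fn * t)
    return total * dt

Tb = 1.0
f1 = 100.0 / Tb
spacing_min = 1.0 / (2.0 * Tb)    # the minimum shift, delta f = 1/(2 Tb)
spacing_small = 1.0 / (4.0 * Tb)  # half of that, for comparison

print(fsk_inner_product(f1, f1 + spacing_min, Tb))    # essentially zero
print(fsk_inner_product(f1, f1 + spacing_small, Tb))  # clearly nonzero
```

With the smaller spacing the two tones remain strongly correlated over one symbol interval, which is why a receiver would have trouble telling them apart.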

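We can in fact describe the minimum shift FSK geometrically by applying the concept of
orthonormal basis functions in Sec. 2.6. Let

$$\psi_i(t) = \sqrt{\frac{2}{T_b}}\, \cos 2\pi\left(f_1 + \frac{i - 1}{2T_b}\right)t \qquad i = 1, 2, \ldots, M$$

It can be simply verified that

$$\int_0^{T_b} \psi_m(t)\,\psi_n(t)\, dt = \begin{cases} 1 & m = n \\ 0 & m \ne n \end{cases}$$

Thus, each of the FSK symbols can be written as

$$A \cos 2\pi f_m t = A\sqrt{\frac{T_b}{2}}\; \psi_m(t) \qquad m = 1, 2, \ldots, M$$

The geometrical relationship of the two FSK symbols for $M = 2$ is easily captured by Fig. 7.36.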

The demodulation of M-ary FSK signals follows the same approach as the binary FSK 
demodulation. Generalizing the binary FSK demodulators of Fig. 7.33 we can apply a bank of 
M coherent or noncoherent detectors to the M-ary FSK signal before making a decision based 
on the strongest detector branch. 

Earlier in the PSD analysis of baseband modulations, we showed that the baseband digital
signal bandwidth at the symbol interval of $T_b$ can be approximated by $1/T_b$. Therefore, for the
minimum shift FSK, $\Delta f = (M - 1)/(4T_b)$, and its bandwidth according to Carson's rule is
approximately

$$2(\Delta f + B) = \frac{M + 3}{2T_b}$$
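
The arithmetic behind this Carson's-rule estimate can be written out step by step. The sketch below assumes the minimum spacing $\delta f = 1/(2T_b)$ and the baseband bandwidth approximation $B \approx 1/T_b$ stated above; the function name is hypothetical.

```python
from fractions import Fraction

def min_shift_fsk_bandwidth(M, Tb):
    """Carson's-rule bandwidth 2(delta_f_dev + B) for minimum shift M-ary FSK."""
    spacing = Fraction(1, 2) / Tb               # delta f = 1 / (2 Tb)
    deviation = Fraction(M - 1, 2) * spacing    # Delta f = (M - 1) delta f / 2
    baseband_bw = Fraction(1) / Tb              # B is approximated by 1 / Tb
    return 2 * (deviation + baseband_bw)

# Example: symbol interval of 1 ms (exact arithmetic with Fraction).
Tb = Fraction(1, 1000)
for M in (2, 4, 8, 16):
    print(M, min_shift_fsk_bandwidth(M, Tb), "Hz")
```

Each doubling of $M$ adds only one bit per symbol, while the estimated bandwidth grows roughly in proportion to $M$.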

In fact, it can be in general shown that the bandwidth of an orthogonal M-ary scheme
is M times that of the binary scheme [see Sec. 10.7, Eq. (10.123)]. Therefore, in an M-ary
orthogonal scheme, the rate of communication increases by a factor of $\log_2 M$ at the cost of
M-fold transmission bandwidth increase. For a comparable noise immunity, the transmitted
power is practically independent of M in the orthogonal scheme. Therefore, unlike M-ary ASK,
M-ary FSK does not require more transmission power. However, its bandwidth requirement
increases almost linearly with M (compared with binary FSK or M-ary ASK).

Figure 7.37

M-ary PSK, PAM, and QAM

By making a small modification to Eq. (7.61), PSK signals in general can be written into the
format of

$$\varphi_{\text{PSK}}(t) = a_m \sqrt{\frac{2}{T_b}} \cos\omega_c t + b_m \sqrt{\frac{2}{T_b}} \sin\omega_c t \qquad 0 \le t < T_b \tag{7.64a}$$

in which $a_m = A \cos\theta_m$ and $b_m = -A \sin\theta_m$. In fact, based on the analysis in Sec. 2.6,
$\sqrt{2/T_b} \cos\omega_c t$ and $\sqrt{2/T_b} \sin\omega_c t$ are orthogonal to each other. Furthermore, they are
normalized over $[0, T_b]$. As a result, we can represent all PSK symbols in a two-dimensional
signal space with basis functions

$$\psi_1(t) = \sqrt{\frac{2}{T_b}} \cos\omega_c t \qquad \psi_2(t) = \sqrt{\frac{2}{T_b}} \sin\omega_c t$$

such that

$$\varphi_{\text{PSK}}(t) = a_m \psi_1(t) + b_m \psi_2(t) \tag{7.64b}$$

We can geometrically illustrate the relationship of the PSK symbols in the signal space
(Fig. 7.37). Equation (7.64) means that PSK modulations can be represented as a QAM
signal. In fact, because the signal is PSK, the signal points must meet a special requirement
that

$$\begin{aligned}
a_m^2 + b_m^2 &= A^2 \cos^2\theta_m + (-A)^2 \sin^2\theta_m \\
&= A^2 = \text{constant}
\end{aligned} \tag{7.64c}$$

In other words, all the signal points must stay on a circle of radius A. In practice, all the signal
points are chosen to be equally spaced in the interest of obtaining the best immunity against
noise. Therefore, for M-ary PSK signaling, the angles are typically chosen uniformly as

$$\theta_m = \theta_0 + \frac{2\pi}{M}(m - 1) \qquad m = 1, 2, \ldots, M$$
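
The signal points of M-ary PSK follow directly from the uniform choice of angles. The Python sketch below generates the pairs $(a_m, b_m)$ and confirms the constant-modulus property of Eq. (7.64c); the function name and the default values of $A$ and $\theta_0$ are illustrative assumptions.

```python
import math

def mpsk_constellation(M, A=1.0, theta0=0.0):
    """Return the list of (a_m, b_m) = (A cos(theta_m), -A sin(theta_m)), m = 1..M."""
    points = []
    for m in range(1, M + 1):
        theta_m = theta0 + 2 * math.pi * (m - 1) / M
        points.append((A * math.cos(theta_m), -A * math.sin(theta_m)))
    return points

# Four-phase example with theta0 = pi/4: every point has |a_m| = |b_m| = A / sqrt(2).
for a_m, b_m in mpsk_constellation(4, A=1.0, theta0=math.pi / 4):
    print(f"a = {a_m:+.4f}, b = {b_m:+.4f}, radius = {math.hypot(a_m, b_m):.4f}")
```

Every point lies at distance $A$ from the origin, so only the phase carries information.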

The special PSK signaling with $M = 4$ is an extremely popular and powerful digital
modulation format.* It in fact is a summation of two binary PSK signals, one using the
(in-phase) carrier of $\cos\omega_c t$ while the other uses the (quadrature) carrier of $\sin\omega_c t$ of the same
frequency. For this reason, it is also known as quadrature PSK (QPSK). We can transmit and
receive both of these signals on the same channel, thus doubling the transmission rate.

* QPSK has several effective variations including the offset QPSK.

To further generalize the PSK to achieve higher data rate, we can see that the PSK
representation of Eq. (7.64) is a special case of the quadrature amplitude modulation (QAM) signal
discussed in Chapter 4 (Fig. 4.19). The only difference lies in the requirement by PSK that the
modulated signal have a constant magnitude (modulus) A. In fact, the much more flexible and
general QAM signaling format can be conveniently used for digital modulation as well. The
signal transmitted by an M-ary QAM system can be written as

$$\begin{aligned}
p_i(t) &= a_i\, p(t) \cos\omega_c t + b_i\, p(t) \sin\omega_c t \\
&= r_i\, p(t) \cos(\omega_c t - \theta_i) \qquad i = 1, 2, \ldots, M
\end{aligned}$$

where

$$r_i = \sqrt{a_i^2 + b_i^2} \qquad \text{and} \qquad \theta_i = \tan^{-1}\frac{b_i}{a_i} \tag{7.65}$$

and $p(t)$ is a properly shaped baseband pulse. The simplest choice of $p(t)$ would be a rectangular
pulse

$$p(t) = \sqrt{\frac{2}{T_b}}\,[u(t) - u(t - T_b)]$$

Certainly, better pulses can also be applied to conserve bandwidth.

Figure 7.38a shows the QAM modulator and demodulator. Each of the two signals $m_1(t)$
and $m_2(t)$ is a baseband $\sqrt{M}$-ary pulse sequence. The two signals are modulated by two
carriers of the same frequency but in phase quadrature. The digital QAM signal $p_i(t)$ can be
generated by means of QAM by letting $m_1(t) = a_i p(t)$ and $m_2(t) = b_i p(t)$. Both $m_1(t)$ and
$m_2(t)$ are baseband PAM signals. The eye diagram of the QAM signal consists of the in-phase
component $m_1(t)$ and the quadrature component $m_2(t)$. Both exhibit the M-ary baseband PAM
eye diagram, as discussed earlier in Sec. 7.6.

The geometrical representation of M-ary QAM can be extended from the PSK signal
space by simply removing the constant modulus constraint Eq. (7.64c). One very popular and
practical choice of $r_i$ and $\theta_i$ for $M = 16$ is shown graphically in Fig. 7.38b. The transmitted
pulse $p_i(t)$ can take on 16 distinct forms, and is, therefore, a 16-ary pulse. Since
$M = 16$, each pulse can transmit the information of $\log_2 16 = 4$ binary digits. This can
be done as follows: there are 16 possible sequences of four binary digits and there are 16
combinations $(a_i, b_i)$ in Fig. 7.38b. Thus, every possible four-bit sequence is transmitted by
a particular $(a_i, b_i)$ or $(r_i, \theta_i)$. Therefore, one signal pulse $r_i p(t) \cos(\omega_c t - \theta_i)$ transmits four
bits. Compared with binary PSK (or BPSK), the 16-ary QAM bit rate is quadrupled without
increasing the bandwidth. The transmission rate can be increased further by increasing the
value of M.

Modulation as well as demodulation can be performed by using the system in Fig. 7.38a.
The inputs are $m_1(t) = a_i p(t)$ and $m_2(t) = b_i p(t)$. The two outputs at the demodulator are $a_i p(t)$
and $b_i p(t)$. From knowledge of $(a_i, b_i)$, we can determine the four transmitted bits. Further
analysis of 16-ary QAM on a noisy channel is carried out in Sec. 10.6 [Eq. (10.104)]. The
practical value of this 16-ary QAM signaling becomes fully evident when we consider its broad range
of applications. In fact, 16-QAM is used in the V.32 telephone data/fax modems (9600 bit/s),
in high-speed cable modems, and in modern satellite digital television broadcasting.

Figure 7.38 (a) QAM or quadrature multiplexing and (b) 16-point QAM (M = 16).
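
To make the four-bits-per-pulse idea concrete, the following Python sketch maps each group of four bits to a pair $(a_i, b_i)$ and back. It rests on assumptions that are not taken from Fig. 7.38b: a square constellation with levels $\pm d, \pm 3d$ on each axis, and a purely illustrative bit assignment (first two bits select $a_i$, last two select $b_i$). Practical standards define their own mappings, and all names here are hypothetical.

```python
import math

LEVELS = (-3, -1, 1, 3)  # assumed 4-level alphabet on each quadrature axis

def qam16_map(bits, d=1.0):
    """Map 4 bits to (a_i, b_i): bits[0:2] pick a_i, bits[2:4] pick b_i (illustrative mapping)."""
    if len(bits) != 4 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected four binary digits")
    a = LEVELS[2 * bits[0] + bits[1]] * d
    b = LEVELS[2 * bits[2] + bits[3]] * d
    return a, b

def qam16_demap(a, b, d=1.0):
    """Recover the 4 bits by choosing the nearest level on each axis."""
    def nearest(x):
        return min(range(4), key=lambda i: abs(LEVELS[i] * d - x))
    ia, ib = nearest(a), nearest(b)
    return [ia >> 1, ia & 1, ib >> 1, ib & 1]

def polar_form(a, b):
    """Return (r_i, theta_i) as in Eq. (7.65), using atan2 to resolve the quadrant."""
    return math.hypot(a, b), math.atan2(b, a)

a, b = qam16_map([1, 0, 0, 1])
print((a, b), polar_form(a, b), qam16_demap(a + 0.3, b - 0.4))
```

Each axis is simply a 4-level PAM signal, so the demapper treats the two demodulator outputs independently.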


Note that if we disable the data stream that modulates $\sin\omega_c t$ in QAM, then all the signaling
points can be reduced to a single dimension. Upon setting $m_2(t) = 0$, QAM becomes

$$p_i(t) = a_i\, p(t) \cos\omega_c t \qquad t \in [0, T_b]$$

This degenerates into the pulse amplitude modulation or PAM. Comparison of the signal
expression of $p_i(t)$ with the analog DSB-SC signal makes it clear that PAM is the digital version
of the DSB-SC signal. Just as analog QAM is formed by the superposition of two DSB-SC
amplitude modulations in phase quadrature, digital QAM consists of two PAM signals, each
having $\sqrt{M}$ signaling levels. Similarly, like the relationship between analog DSB-SC and
QAM, PAM requires the same amount of bandwidth as QAM does. However, PAM is much
less efficient because it would need M modulation signaling levels in one dimension, whereas
QAM requires only $\sqrt{M}$ signaling levels in each of the two orthogonal QAM dimensions.

Trading Power and Bandwidth 

In Chapter 10 we shall discuss several other types of M-ary signaling. The nature of the
exchange between the transmission bandwidth and the transmitted power (or SNR) depends
on the choice of M-ary scheme. For example, in orthogonal signaling, the transmitted power is
practically independent of M but the transmission bandwidth increases with M. Contrast this
to the PAM case, where the transmitted power increases roughly with $M^2$ while the bandwidth
remains constant. Thus, M-ary signaling allows us great flexibility in trading signal power
(or SNR) for transmission bandwidth. The choice of the appropriate system will depend upon
the particular circumstances. For instance, it will be appropriate to use QAM signaling if the
bandwidth is at a premium (as in telephone lines) and to use orthogonal signaling when power
is at a premium (as in space communication).


7.10 MATLAB EXERCISES 


In this section, we provide MATLAB programs to generate the eye diagrams. The first step is
to specify the basic pulse shapes in PAM. The next four short programs are used to generate
NRZ, RZ, half-sinusoid, and raised-cosine pulses.


% (pnrz.m)
% generating a rectangular pulse of width T
% Usage function pout=pnrz(T);
function pout=pnrz(T);
pout=ones(1,T);
end


% (prz.m)
% generating a rectangular pulse of width T/2
% Usage function pout=prz(T);
function pout=prz(T);
pout=[zeros(1,T/4) ones(1,T/2) zeros(1,T/4)];
end


% (psine.m)
% generating a sinusoid pulse of width T
%
function pout=psine(T);
pout=sin(pi*[0:T-1]/T);
end


% (prcos.m)
% Usage y=prcos(rollfac,length,T)
function y=prcos(rollfac,length,T)
% rollfac = 0 to 1 is the rolloff factor
% length is the onesided pulse length in the number of T
% length = 2T+1;
% T is the oversampling rate
y=rcosfir(rollfac, length, T, 1, 'normal');
end

The first program (binary_eye.m) uses the four different pulses to generate eye
diagrams of binary polar signaling.


% (binary_eye.m)
% generate and plot eyediagrams
%
clear;clf;
data = sign(randn(1,400)); % Generate 400 random bits
Tau=64; % Define the symbol period
dataup=upsample(data, Tau); % Generate impulse train
yrz=conv(dataup,prz(Tau)); % Return to zero polar signal
yrz=yrz(1:end-Tau+1);
ynrz=conv(dataup,pnrz(Tau)); % Non-return to zero polar
ynrz=ynrz(1:end-Tau+1);
ysine=conv(dataup,psine(Tau)); % half sinusoid polar
ysine=ysine(1:end-Tau+1);
Td=4; % truncating raised cosine to 4 periods
yrcos=conv(dataup,prcos(0.5,Td,Tau)); % rolloff factor = 0.5
yrcos=yrcos(2*Td*Tau:end-2*Td*Tau+1); % generating RC pulse train
eye1=eyediagram(yrz,2*Tau,Tau,Tau/2);title('RZ eye-diagram');
eye2=eyediagram(ynrz,2*Tau,Tau,Tau/2);title('NRZ eye-diagram');
eye3=eyediagram(ysine,2*Tau,Tau,Tau/2);title('Half-sine eye-diagram');
eye4=eyediagram(yrcos,2*Tau,Tau);title('Raised-cosine eye-diagram');


The second program (Mary_eye.m) uses the four different pulses to generate eye
diagrams of four-level PAM signaling.


% (Mary_eye.m)
% generate and plot eyediagrams
%
clear;clf;
data = sign(randn(1,400))+2*sign(randn(1,400)); % 400 PAM symbols
Tau=64; % Define the symbol period
dataup=upsample(data, Tau); % Generate impulse train
yrz=conv(dataup,prz(Tau)); % Return to zero polar signal
yrz=yrz(1:end-Tau+1);
ynrz=conv(dataup,pnrz(Tau)); % Non-return to zero polar
ynrz=ynrz(1:end-Tau+1);
ysine=conv(dataup,psine(Tau)); % half sinusoid polar
ysine=ysine(1:end-Tau+1);
Td=4; % truncating raised cosine to 4 periods
yrcos=conv(dataup,prcos(0.5,Td,Tau)); % rolloff factor = 0.5
yrcos=yrcos(2*Td*Tau:end-2*Td*Tau+1); % generating RC pulse train
eye1=eyediagram(yrz,2*Tau,Tau,Tau/2);title('RZ eye-diagram');
eye2=eyediagram(ynrz,2*Tau,Tau,Tau/2);title('NRZ eye-diagram');
eye3=eyediagram(ysine,2*Tau,Tau,Tau/2);title('Half-sine eye-diagram');
eye4=eyediagram(yrcos,2*Tau,Tau);title('Raised-cosine eye-diagram');
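
The signal-processing steps behind these programs (impulse train, pulse shaping by convolution, and slicing the waveform into overlaid traces) can also be sketched without MATLAB. The Python version below uses only the standard library, handles the polar NRZ case, and omits plotting; `upsample`, `convolve`, and `eye_traces` are hypothetical helper names written for this sketch, not the toolbox functions of the same or similar names.

```python
import random

def upsample(symbols, tau):
    """Insert tau-1 zeros after every symbol to form an impulse train."""
    out = []
    for s in symbols:
        out.append(s)
        out.extend([0] * (tau - 1))
    return out

def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0:
            continue  # skip the zeros of the impulse train
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def eye_traces(signal, trace_len, offset=0):
    """Cut the waveform into consecutive segments of trace_len samples for overlaying."""
    return [signal[k:k + trace_len]
            for k in range(offset, len(signal) - trace_len + 1, trace_len)]

random.seed(0)
tau = 16                                        # samples per symbol (assumed value)
data = [random.choice((-1, 1)) for _ in range(100)]
nrz_pulse = [1.0] * tau                         # full-width rectangular pulse
y = convolve(upsample(data, tau), nrz_pulse)[:len(data) * tau]
traces = eye_traces(y, 2 * tau)                 # two symbol periods per trace
print(len(traces), len(traces[0]))
```

Plotting every element of `traces` on the same axes gives the eye diagram; for the NRZ pulse each trace stays at $\pm 1$ and switches only at symbol boundaries.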





REFERENCES 

1. A. Lender, "Duobinary Technique for High Speed Data Transmission," IEEE Trans. Commun.
Electron., vol. CE-82, pp. 214-218, May 1963.

2. A. Lender, "Correlative Level Coding for Binary-Data Transmission," IEEE Spectrum, vol. 3, no. 2,
pp. 104-115, Feb. 1966.

3. P. Bylanski and D. G. W. Ingram, Digital Transmission Systems, Peter Peregrinus Ltd., Hertshire,
England, 1976.

4. H. Nyquist, "Certain Topics in Telegraph Transmission Theory," AIEE Trans., vol. 47, p. 817, April
1928.

5. E. D. Sunde, Communication Systems Engineering Technology, Wiley, New York, 1969.

6. R. W. Lucky and H. R. Rudin, "Generalized Automatic Equalization for Communication Channels,"
IEEE Int. Commun. Conf., vol. 22, 1966.

7. W. F. Trench, "An Algorithm for the Inversion of Finite Toeplitz Matrices," SIAM, vol. 12,
pp. 515-522, Sept. 1964.

8. A. Lender, Chapter 7, in Digital Communications: Microwave Applications, K. Feher, Ed.,
Prentice-Hall, Englewood Cliffs, NJ, 1981.


PROBLEMS 

7.2-1 Consider a full-width rectangular pulse shape

$$p(t) = \Pi(t/T_b)$$

(a) Find PSDs for the polar, on-off, and bipolar signaling.

(b) Sketch roughly the PSDs and find their bandwidths. For each case, compare the bandwidth
to the case where $p(t)$ is a half-width rectangular pulse.

7.2-2 (a) A random binary data sequence 110100101... is transmitted by using a Manchester
(split-phase) line code with the pulse $p(t)$ shown in Fig. 7.7a. Sketch the waveform $y(t)$.

(b) Derive $S_y(f)$, the PSD of a Manchester (split-phase) signal in part (a) assuming 1 and 0
equally likely. Roughly sketch this PSD and find its bandwidth.

7.2-3 If the pulse shape is

$$p(t) = \Pi\!\left(\frac{t}{0.5\,T_b}\right)$$

use differential code (see Fig. 7.18) to derive the PSD for a binary signal. Determine the PSD
$S_y(f)$.

7.2-4 The duobinary line coding proposed by Lender is also ternary like bipolar, but it requires only
half the bandwidth of bipolar. In practice, duobinary coding is indirectly realized by using a
special pulse shape as discussed in Sec. 7.3 (see Fig. 7.18). In this code, a 0 is transmitted by no
pulse, and a 1 is transmitted by a pulse $p(t)$ or $-p(t)$ using the following rule. A 1 is encoded
by the same pulse as that used for the previous 1 if there are an even number of 0s between
them. It is encoded by a pulse of opposite polarity if there are an odd number of 0s between
them. The number 0 is considered to be an even number. Like bipolar, this code also has a single
error detection capability, because correct reception implies that between successive pulses of
the same polarity, an even number of 0s must occur, and between successive pulses of opposite
polarity, an odd number of 0s must occur.

(a) Assuming half-width rectangular pulse, sketch the duobinary signal $y(t)$ for the random
binary sequence

1110001101001010

(b) Determine $R_0$, $R_1$, and $R_2$ for this code. Assume (or you may show if you like) that $R_n = 0$
for all $n > 2$. Find and sketch the PSD for this line code (assuming half-width pulse). Show
that its bandwidth is $R_b/2$ Hz, half that of bipolar.

7.3-1 Data at a rate of 6 kbit/s is to be transmitted over a leased line of bandwidth 4 kHz by using
Nyquist criterion pulses. Determine the maximum value of the roll-off factor r that can be used.


7.3-2 In a certain telemetry system, there are eight analog measurements, each of bandwidth 2 kHz.
Samples of these signals are time-division-multiplexed, quantized, and binary-coded. The error
in sample amplitudes cannot be greater than 1% of the peak amplitude.

(a) Determine L, the number of quantization levels.

(b) Find the transmission bandwidth $B_T$ if Nyquist criterion pulses with roll-off factor r = 0.2
are used. The sampling rate must be at least 25% above the Nyquist rate.

7.3-3 A leased telephone line of bandwidth 3 kHz is used to transmit binary data. Calculate the data
rate (in bits per second) that can be transmitted if we use:

(a) Polar signal with rectangular half-width pulses.

(b) Polar signal with rectangular full-width pulses.

(c) Polar signal using Nyquist criterion pulses of r = 0.25.

(d) Bipolar signal with rectangular half-width pulses.

(e) Bipolar signal with rectangular full-width pulses.


7.3-4 The Fourier transform $P(f)$ of the basic pulse $p(t)$ used in a certain binary communication
system is shown in Fig. P7.3-4.

(a) From the shape of $P(f)$, explain at what pulse rate this pulse would satisfy Nyquist's criterion.

(b) Find $p(t)$ and verify that this pulse does (or does not) satisfy Nyquist's criterion.

(c) If the pulse does satisfy the Nyquist criterion, what is the transmission rate (in bits per
second) and what is the roll-off factor?


Figure P.7.3-4

7.3-5 A pulse $p(t)$ whose spectrum $P(f)$ is shown in Fig. P7.3-5 satisfies Nyquist's criterion. If $f_1 = 0.8$
MHz and $f_2 = 1.2$ MHz, determine the maximum rate at which binary data can be transmitted
by this pulse using Nyquist's criterion. What is the roll-off factor?

7.3-6 Binary data at a rate of 1 Mbit/s is to be transmitted by using Nyquist criterion pulses with $P(f)$
shown in Fig. P7.3-5. The frequencies $f_1$ and $f_2$ of the spectrum are adjustable. The channel
available for the transmission of this data has a bandwidth of 700 kHz. Determine $f_1$ and $f_2$ and
the roll-off factor.

Figure P.7.3-5



7.3-7 Show that the inverse Fourier transform of P(f) in Eq. (7.39) is indeed the second criterion pulse 
p(t) given in Eq. (7.38). 

Hint: Use Eq. (3.32) to find the inverse transform of P(f) in Eq. (7.39) and express sinc (x) in 
the form sin x/x. 


7.3-8 Show that the inverse Fourier transform of P(f) [the raised cosine pulse spectrum in Eq. (7.35)] 
is the pulse p(t) given in Eq. (7.36). 

Hint: Use Eq. (3.32) to find the inverse transform of P(f) in Eq. (7.39) and express sinc (x) in 
the form sin x/x. 


7.3-9 Show that there exists one (and only one) pulse p(t) of bandwidth $R_b/2$ Hz that satisfies the 
criterion of second criterion pulse [Eq. (7.37)]. Show that this pulse is given by 

$$p(t) = \{\mathrm{sinc}\,(\pi R_b t) + \mathrm{sinc}\,[\pi R_b (t - T_b)]\} = \frac{\sin (\pi R_b t)}{\pi R_b t\,(1 - R_b t)}$$

and its Fourier transform is P(f) given in Eq. (7.39). 

Hint: For a pulse of bandwidth $R_b/2$, the Nyquist interval is $1/R_b = T_b$, and the conditions 
(7.37) give the Nyquist sample values at $t = \pm nT_b$. Use the interpolation formula [Eq. (6.10)] 
with $B = R_b/2$, $T_s = T_b$ to construct p(t). In determining P(f), recognize that 

$$1 + e^{-j2\pi f T_b} = e^{-j\pi f T_b}\left(e^{j\pi f T_b} + e^{-j\pi f T_b}\right)$$


7.3-10 In a binary data transmission using duobinary pulses, sample values were read as follows: 

1 2 0 -2 -2 0 0 -2 0 2 0 0 2 0 0 0 -2 


(a) Explain if there is any error in detection. 

(b) If there is no detection error, determine the received bit sequence. 

7.3-11 In a binary data transmission using duobinary pulses, sample values of the received pulses were 
read as follows: 


1 2 0 0 0 -2 0 0 -2 0 2 0 0 -2 0 2 2 0 -2 

(a) Explain if there is any error. 

(b) Can you guess the correct transmitted digit sequence? There is more than one possible 
correct sequence. Give as many correct sequences as possible, assuming that more than one 
detection error is extremely unlikely. 


7.4-1 In Example 7.2, when the sequence S = 101010100000111 was applied to the input of the 
scrambler in Fig. 7.20a, the output T was found to be 101110001101001. Verify that when this 
sequence T is applied to the input of the descrambler in Fig. 7.20b, the output is the original 
input sequence, S = 101010100000111. 

7.4-2 Design a descrambler for the scrambler of Fig. P7.4-2. If a sequence S = 101010100000111 is 
applied to the input of this scrambler, determine the output sequence T. Verify that if this T is 
applied to the input of the descrambler, the output is the sequence S. 


Figure P.7.4-2 



7.4-3 Repeat Prob. 7.4-2 if the scrambler shown in Fig. P7.4-3 is concatenated with the scrambler in 
Fig. P7.4-2 to form a composite scrambler. 


Figure P.7.4-3 


7.5-1 In a certain binary communication system that uses Nyquist's criterion pulses, a received pulse 
$p_r(t)$ (see Fig. 7.22a) has the following nonzero sample values: 

$p_r(0) = 1$ 

$p_r(T_b) = 0.1$      $p_r(-T_b) = 0.3$ 
$p_r(2T_b) = -0.02$   $p_r(-2T_b) = -0.07$ 

(a) Determine the tap settings of a three-tap, zero-forcing equalizer. 

(b) Using the equalizer in part (a), find the residual nonzero ISI. 

7.7-1 In a PAM scheme with M = 16: 

(a) Determine the minimum transmission bandwidth required to transmit data at a rate of 12,000 
bits/sec with zero ISI. 

(b) Determine the transmission bandwidth if Nyquist criterion pulses with a roll-off factor 
r = 0.2 are used to transmit data. 

7.7-2 An audio signal of bandwidth 4 kHz is sampled at a rate 25% above the Nyquist rate and 
quantized. The quantization error is not to exceed 0.1% of the signal peak amplitude. The 
resulting quantized samples are now coded and transmitted by 4-ary pulses. 

(a) Determine the minimum number of 4-ary pulses required to encode each sample. 

(b) Determine the minimum transmission bandwidth required to transmit this data with zero 
ISI. 

(c) If 4-ary pulses satisfying Nyquist's criterion with 25% roll-off are used to transmit this data, 
determine the transmission bandwidth. 

7.7-3 Binary data is transmitted over a certain channel at a rate $R_b$ bit/s. To reduce the transmission 
bandwidth, it is decided to use 16-ary PAM signaling to transmit this data. 

(a) By what factor is the bandwidth reduced? 

(b) By what factor is the transmitted power increased, assuming minimum separation between 
pulse amplitudes to be the same in both cases? 

Hint: Take the pulse amplitudes to be ±A/2, ±3A/2, ±5A/2, ±7A/2, ..., ±15A/2 so that 
the minimum separation between various amplitude levels is A (same as that in the binary 
case pulses ±A/2). Assume that all 16 levels are equally likely. Recall also that multiplying 
a pulse by a constant k increases its energy $k^2$-fold. 

7.7-4 An audio signal of bandwidth 10 kHz is sampled at a rate of 24 kHz, quantized into 256 levels 
and coded by means of M-ary PAM pulses satisfying Nyquist's criterion with a roll-off factor 
r = 0.2. A 30 kHz bandwidth is available to transmit the data. Determine the best value of M. 

7.7-5 Consider a case of binary transmission via polar signaling that uses half-width rectangular pulses 
of amplitudes A/2 and -A/2. The data rate is $R_b$ bit/s. 

(a) What is the minimum transmission bandwidth and the transmitted power? 

(b) This data is to be transmitted by M-ary rectangular half-width pulses of amplitudes 

±A/2, ±3A/2, ±5A/2, ..., ±[(M - 1)/2]A 

Note that to maintain about the same noise immunity, the minimum pulse amplitude 
separation is A. If each of the M-ary pulses is equally likely to occur, show that the transmitted 
power is 

$$P = \frac{(M^2 - 1)A^2}{24 \log_2 M}$$

Also determine the transmission bandwidth. 

7.8-1 Figure P7.8-1 shows a binary data transmission scheme. The baseband signal generator uses 
full-width pulses and polar signaling. The data rate is 1 Mbit/s. 

(a) If the modulator generates a PSK signal, what is the bandwidth of the modulated output? 

(b) If the modulator generates FSK with the difference $f_{c1} - f_{c0}$ = 100 kHz (see Fig. 7.32c), 
determine the modulated signal bandwidth. 


Figure P.7.8-1 



7.8-2 Repeat Prob. 7.8-1 if, instead of full-width pulses, Nyquist's criterion pulses with r = 0.2 are 
used. 

7.8-3 Repeat Prob. 7.8-1 if a multiamplitude scheme with M = 4 (PAM signaling with full-width 
pulses) is used. In FSK [Prob. 7.8-1, part (b)], assume that successive amplitude levels are 
transmitted by frequencies separated by 100 kHz. 



8 FUNDAMENTALS OF 
PROBABILITY THEORY 


Thus far, we have been studying signals whose values at any instant t are determined 
by their analytical or graphical description. These are called deterministic signals, 
implying complete certainty about their values at any moment t. Such signals, which 
can be specified with certainty, cannot convey information. It will be seen in Chapter 13 that 
information is inherently related to uncertainty. The higher the uncertainty about a signal (or 
message) to be received, the higher its information content. If a message to be received is 
specified (i.e., if it is known beforehand), then it contains no uncertainty and conveys no new 
information to the receiver. Hence, signals that convey information must be unpredictable. 
In addition to information-bearing signals, noise signals that perturb information signals in a 
system are also unpredictable (otherwise they can simply be subtracted). These unpredictable 
message signals and noise waveforms are examples of random processes that play key roles 
in communication systems and their analysis. 

Random phenomena arise either because of our partial ignorance of the generating 
mechanism (as in message or noise signals) or because the laws governing the phenomena may 
be fundamentally random (as in quantum mechanics). Yet in another situation, such as the 
outcome of rolling a die, it is possible to predict the outcome provided we know exactly all 
the conditions: the angle of the throw, the nature of the surface on which it is thrown, the 
force imparted by the player, and so on. The exact analysis, however, is so complex and so 
sensitive to all the conditions that it is impractical to carry it out, and we are content to accept 
the outcome prediction on an average basis. Here the random phenomenon arises from our 
unwillingness to carry out the exact and full analysis because it is impractical to amass all the 
conditions precisely or not worth the effort. 

We shall begin with a review of the basic concepts of the theory of probability, which 
forms the basis for describing random processes. 


8.1 CONCEPT OF PROBABILITY 

To begin the discussion of probability, we must define some basic elements and important terms. 
The term experiment is used in probability theory to describe a process whose outcome cannot 
be fully predicted because the conditions under which it is performed cannot be predetermined 
with sufficient accuracy and completeness. Tossing a coin, rolling a die, and drawing a card 
from a deck are some examples of such experiments. An experiment may have several 
separately identifiable outcomes. For example, rolling a die has six possible identifiable outcomes 
(1, 2, 3, 4, 5, and 6). An event is a subset of outcomes that share some common characteristics. 
An event occurs if the outcome of the experiment belongs to the specific subset of outcomes 
defining the event. In the experiment of rolling a die, for example, the event “odd number on a 
throw” can result from any one of three outcomes (viz., 1, 3, and 5). Hence, this event is a set 
consisting of three outcomes (1, 3, and 5). Thus, events are groupings of outcomes into classes 
among which we choose to distinguish. The ideas of experiment, outcomes, and events form 
the basic foundation of probability theory. These ideas can be better understood by using the 
concepts of set theory. 

We define the sample space S as a collection of all possible and separately identifiable 
outcomes of an experiment. In other words, the sample space S specifies the experiment. Each 
outcome is an element, or sample point, of this space S and can be conveniently represented 
by a point in the sample space. In the experiment of rolling a die, for example, the sample space 
consists of six elements represented by six sample points $\zeta_1, \zeta_2, \zeta_3, \zeta_4, \zeta_5$, and $\zeta_6$, where $\zeta_i$ 
represents the outcome “a number i is thrown” (Fig. 8.1). The event, on the other hand, is a 
subset of S. The event “an odd number is thrown,” denoted by $A_o$, is a subset of S (or a set of 
sample points $\zeta_1, \zeta_3$, and $\zeta_5$). Similarly, the event “an even number is thrown,” denoted by $A_e$, 
is another subset of S (or a set of sample points $\zeta_2, \zeta_4$, and $\zeta_6$): 

$$A_o = (\zeta_1, \zeta_3, \zeta_5) \qquad A_e = (\zeta_2, \zeta_4, \zeta_6)$$

Let us denote the event “a number equal to or less than 4 is thrown” as B. Thus, 

$$B = (\zeta_1, \zeta_2, \zeta_3, \zeta_4)$$

These events are clearly marked in Fig. 8.1. Note that an outcome can also be an event, because 
an outcome is a subset of S with only one element. 

The complement of any event A, denoted by $A^c$, is the event containing all points not in 
A. Thus, for the event B in Fig. 8.1, $B^c = (\zeta_5, \zeta_6)$, $A_o^c = A_e$, and $A_e^c = A_o$. An event that has 
no sample points is a null event, which is denoted by $\emptyset$ and is equal to $S^c$. 

The union of events A and B, denoted by $A \cup B$, is the event that contains all points in 
A and B. This is the event stated as having “an outcome of either A or B.” For the events in 
Fig. 8.1, 

$$A_o \cup B = (\zeta_1, \zeta_3, \zeta_5, \zeta_2, \zeta_4)$$
$$A_e \cup B = (\zeta_2, \zeta_4, \zeta_6, \zeta_1, \zeta_3)$$

Figure 8.2 Representation of (a) complement, (b) union, and (c) intersection of events. 






Observe that the union operation commutes: 

$$A \cup B = B \cup A$$ (8.1) 

The intersection of events A and B, denoted by $A \cap B$ or simply by AB, is the event that contains 
points common to A and B. This is the event that “outcome is both A and B,” also known as 
the joint event $A \cap B$. Thus, the event $A_e B$, “a number that is even and equal to or less than 4 
is thrown,” is a set $(\zeta_2, \zeta_4)$, and similarly for $A_o B$, 

$$A_e B = (\zeta_2, \zeta_4) \qquad A_o B = (\zeta_1, \zeta_3)$$

Observe that the intersection also commutes: 

$$A \cap B = B \cap A$$ (8.2) 

All these concepts can be demonstrated on a Venn diagram (Fig. 8.2). If the events A and 
B are such that 

$$A \cap B = \emptyset$$ (8.3) 

then A and B are said to be disjoint, or mutually exclusive, events. This means events A and 
B cannot occur simultaneously. In Fig. 8.1 events $A_e$ and $A_o$ are mutually exclusive, meaning 
that in any trial of the experiment if $A_e$ occurs, $A_o$ cannot occur at the same time, and vice 
versa. 
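
The set operations above map directly onto Python's built-in set type. The following is a minimal sketch, assuming the integers 1 to 6 stand in for the sample points of the die experiment; the names `S`, `A_o`, `A_e`, `B`, and `complement` are illustrative choices, not notation from the text.

```python
# Sample space for one roll of a die; integers stand in for the six sample points.
S = frozenset(range(1, 7))
A_o = frozenset({1, 3, 5})     # "an odd number is thrown"
A_e = frozenset({2, 4, 6})     # "an even number is thrown"
B = frozenset({1, 2, 3, 4})    # "a number equal to or less than 4 is thrown"


def complement(event, space=S):
    """Return all sample points of `space` that are not in `event`."""
    return space - event


print("B^c      =", sorted(complement(B)))
print("A_o U B  =", sorted(A_o | B))        # union
print("A_e B    =", sorted(A_e & B))        # intersection (joint event)
print("A_o, A_e disjoint:", A_o & A_e == frozenset())
```

The `|` and `&` operators are commutative for sets, which mirrors Eqs. (8.1) and (8.2).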

Relative Frequency and Probability 

Although the outcome of an experiment is unpredictable, there is a statistical regularity about 
the outcomes. For example, if a coin is tossed a large number of times, about half the times 
the outcome will be “heads,” and the remaining half of the times it will be “tails.” We may 
say that the relative frequency of the two outcomes “heads” or “tails” is one-half. This relative 
frequency represents the likelihood of a particular event. 

Let A be one of the events of interest in an experiment. If we conduct a sequence of N 
independent trials† of this experiment, and if the event A occurs in N(A) out of these N trials, 
then the fraction 

$$f(A) = \lim_{N \to \infty} \frac{N(A)}{N}$$ (8.4) 

† Trials conducted under similar discernible conditions. 

is called the relative frequency of the event A. Observe that for small N, the fraction N(A)/N 
may vary widely with N. As N increases, the fraction will approach a limit because of 
statistical regularity. 

The probability of an event has the same connotations as the relative frequency of that 
event. Hence, we estimate the probability of each event as the relative frequency of that event.* 
Therefore, to an event A, we assign the probability P(A) as 

$$P(A) = \lim_{N \to \infty} \frac{N(A)}{N}$$ (8.5) 

From Eq. (8.5), it follows that 

$$0 \le P(A) \le 1$$ (8.6) 
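
The way the fraction N(A)/N settles down as N grows can be seen in a small simulation. This is only a sketch under stated assumptions: a fair die simulated with Python's pseudo-random generator, a fixed seed for repeatability, and a hypothetical helper named `relative_frequency`.

```python
import random


def relative_frequency(event, n_trials, seed=0):
    """Estimate P(event) as N(A)/N over n_trials simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) in event)
    return hits / n_trials


A_o = {1, 3, 5}  # "an odd number is thrown"
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(A_o, n))
```

For small N the printed fraction can sit far from one-half; for large N it should be close to it, and it always lies between 0 and 1.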


Example 8.1 Assign probabilities to each of the six outcomes in Fig. 8.1. 

Because each of the six outcomes is equally likely in a large number of independent trials, 
each outcome will appear in one-sixth of the trials. Hence, 

$$P(\zeta_i) = \frac{1}{6} \qquad i = 1, 2, 3, 4, 5, 6$$ (8.7) 


Consider now the two events A and B of an experiment. Suppose we conduct N independent 
trials of this experiment and events A and B occur in N(A) and N(B) trials, respectively. If A 
and B are mutually exclusive (or disjoint), then if A occurs, B cannot occur, and vice versa. 
Hence, the event $A \cup B$ occurs in N(A) + N(B) trials and 

$$P(A \cup B) = \lim_{N \to \infty} \frac{N(A) + N(B)}{N} = P(A) + P(B) \qquad \text{if } A \cap B = \emptyset$$ (8.8) 

This result can be extended to more than two mutually exclusive events. In other words, if 
events $\{A_i\}$ are mutually exclusive such that 

$$A_i \cap A_j = \emptyset \qquad i \ne j$$

then 

$$P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i)$$



* Observe that we are not defining the probability by the relative frequency. To a given event, a probability is closely 
estimated by the relative frequency of the event when this experiment is repeated many times. Modern theory of 
probability, being a branch of mathematics, starts with certain axioms about probability [Eqs. (8.6), (8.8), and (8.11)]. 
It assumes that somehow these probabilities are assigned by nature. We use relative frequency to estimate probability 
because it is reasonable in the sense that it closely approximates our experience and expectation of “probability.” 

Example 8.2 Assign probabilities to the events $A_e$, $A_o$, B, $A_e B$, and $A_o B$ in Fig. 8.1. 

Because $A_e = (\zeta_2 \cup \zeta_4 \cup \zeta_6)$, where $\zeta_2$, $\zeta_4$, and $\zeta_6$ are mutually exclusive, 

$$P(A_e) = P(\zeta_2) + P(\zeta_4) + P(\zeta_6)$$

From Eq. (8.7) it follows that 

$$P(A_e) = \frac{1}{2}$$ (8.9a) 

Similarly, 

$$P(A_o) = \frac{1}{2}$$ (8.9b) 

$$P(B) = \frac{2}{3}$$ (8.9c) 

From Fig. 8.1 we also observe that 

$$A_e B = \zeta_2 \cup \zeta_4$$

and 

$$P(A_e B) = P(\zeta_2) + P(\zeta_4) = \frac{1}{3}$$ (8.10a) 

Similarly, 

$$P(A_o B) = \frac{1}{3}$$ (8.10b) 


We can also show that 


P(S) = 1 (8.11) 

This result can be proved by using the relative frequency. Let an experiment be repeated N 
times (N large). Because S is the union of all possible outcomes, S occurs in every trial. Hence, 
N out of N trials lead to event S , and the result follows. 


Example 8.3 Two dice are thrown. Determine the probability that the sum on the dice is seven. 

For this experiment, the sample space contains 36 sample points because 36 possible 
outcomes exist. All the outcomes are equally likely. Hence, the probability of each outcome 
is 1/36. 

A sum of seven can be obtained by the six combinations: (1, 6), (2, 5), (3, 4), (4, 3), 
(5, 2), and (6, 1). Hence, the event “a seven is thrown” is the union of six outcomes, each 
with probability 1/36. Therefore, 

$$P(\text{“a seven is thrown”}) = \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} = \frac{1}{6}$$
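
The counting argument of Example 8.3 can be checked by listing the 36 outcomes explicitly. The sketch below assumes fair dice and uses exact fractions; the helper name `prob_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))


def prob_sum(target):
    """P(sum of the two dice equals target), found by counting favorable outcomes."""
    favorable = [o for o in outcomes if sum(o) == target]
    return Fraction(len(favorable), len(outcomes))


print(prob_sum(7))  # 1/6
```

Because the events "sum equals s" for s = 2, ..., 12 are mutually exclusive and cover the sample space, their probabilities add up to 1, in line with Eqs. (8.8) and (8.11).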


Example 8.4 A coin is tossed four times in succession. Determine the probability of obtaining exactly two 
heads. 

A total of $2^4 = 16$ distinct outcomes are possible, all of which are equally likely because 
of the symmetry of the situation. Hence, the sample space consists of 16 points, each with 
probability 1/16. The 16 outcomes are as follows: 

   HHHH      TTTT 
   HHHT      TTTH 
   HHTH      TTHT 
 → HHTT    → TTHH 
   HTHH      THTT 
 → HTHT    → THTH 
 → HTTH    → THHT 
   HTTT      THHH 

Six out of these 16 outcomes lead to the event “obtaining exactly two heads” (arrows). 
Because all of the six outcomes are disjoint (mutually exclusive), 

$$P(\text{obtaining exactly two heads}) = \frac{6}{16} = \frac{3}{8}$$

In Example 8.4, the method of listing all possible outcomes quickly becomes unwieldy 
as the number of tosses increases. For example, if a coin is tossed just 10 times, the total 
number of outcomes is 1024. A more convenient approach would be to apply the results of 
combinatorial analysis used in Bernoulli trials, to be discussed shortly. 
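
To make the growth in effort concrete, here is a brute-force count that lists every sequence of tosses, compared with the binomial coefficient from the standard library. It is a sketch only; `count_by_listing` is an illustrative name.

```python
from itertools import product
from math import comb


def count_by_listing(n, k):
    """Count sequences of n tosses with exactly k heads by listing all 2**n outcomes."""
    return sum(1 for seq in product("HT", repeat=n) if seq.count("H") == k)


print(count_by_listing(4, 2), "out of", 2 ** 4)     # the arrows in Example 8.4
print(count_by_listing(10, 5), "out of", 2 ** 10)   # already 1024 sequences to scan
print(comb(10, 5))                                  # same count without listing
```

The listing approach scans all 2^n sequences, whereas `math.comb` returns the count directly.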

Conditional Probability and Independent Events 

Conditional Probability: It often happens that the probability of one event is influenced 
by the outcome of another event. As an example, consider drawing two cards in succession 
from a deck. Let A denote the event that the first card drawn is an ace. We do not replace the card 
drawn in the first trial. Let B denote the event that the second card drawn is an ace. It is evident 
that the probability of drawing an ace in the second trial will be influenced by the outcome of 
the first draw. If the first draw does not result in an ace, then the probability of obtaining an ace 
in the second trial is 4/51. The probability of event B thus depends on whether event A occurs. 
We now introduce the conditional probability P(B|A) to denote the probability of event B 
when it is known that event A has occurred. P(B|A) is read as “probability of B given A.” 

Let there be N trials of an experiment, in which the event A occurs $n_1$ times. Of these $n_1$ 
trials, event B occurs $n_2$ times. It is clear that $n_2$ is the number of times that the joint event 
$A \cap B$ (Fig. 8.2c) occurs. That is, 

$$P(A \cap B) = \lim_{N \to \infty} \left(\frac{n_2}{N}\right) = \lim_{N \to \infty} \left(\frac{n_1}{N}\right)\left(\frac{n_2}{n_1}\right)$$

Note that $\lim_{N \to \infty}(n_1/N) = P(A)$. Also, $\lim_{N \to \infty}(n_2/n_1) = P(B|A)$,* because B occurs 
$n_2$ of the $n_1$ times that A occurred. This represents the conditional probability of B given A. 
Therefore, 

$$P(A \cap B) = P(A)P(B|A)$$ (8.12) 

and 

$$P(B|A) = \frac{P(A \cap B)}{P(A)} \qquad \text{provided } P(A) > 0$$ (8.13a) 

Using a similar argument, we obtain 

$$P(A|B) = \frac{P(A \cap B)}{P(B)} \qquad \text{provided } P(B) > 0$$ (8.13b) 

It follows from Eqs. (8.13) that 

$$P(A|B) = \frac{P(A)P(B|A)}{P(B)}$$ (8.14a) 

$$P(B|A) = \frac{P(B)P(A|B)}{P(A)}$$ (8.14b) 


Equations (8.14) are called Bayes' rule. In Bayes' rule, one conditional probability is expressed 
in terms of the reversed conditional probability. 


Example 8.5 An experiment consists of drawing two cards from a deck in succession (without replacing the 
first card drawn). Assign a value to the probability of obtaining two red aces in two draws. 

Let A and B be the events “red ace in the first draw” and “red ace in the second draw,” 
respectively. We wish to determine $P(A \cap B)$, 

$$P(A \cap B) = P(A)P(B|A)$$

and the relative frequency of A is 2/52 = 1/26. Hence, 

$$P(A) = \frac{1}{26}$$

Also, P(B|A) is the probability of drawing a red ace in the second draw given that the first 
draw was a red ace. The relative frequency of this event is 1/51, so 

$$P(B|A) = \frac{1}{51}$$

Hence, 

$$P(A \cap B) = \left(\frac{1}{26}\right)\left(\frac{1}{51}\right) = \frac{1}{1326}$$

* Here we are implicitly using the fact that $n_1 \to \infty$ as $N \to \infty$. This is true provided the ratio 
$\lim_{N \to \infty}(n_1/N) \ne 0$, that is, if $P(A) \ne 0$. 
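
The numbers in Example 8.5 can be reproduced by enumerating every ordered pair of distinct cards and applying Eq. (8.13a) to the counts. This is a sketch under the assumption of a standard 52-card deck; the card encoding and the helper `is_red_ace` are illustrative.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]


def is_red_ace(card):
    """True for the ace of hearts or the ace of diamonds."""
    return card[0] == "A" and card[1] in ("hearts", "diamonds")


# Ordered draws of two cards without replacement: 52 * 51 equally likely pairs.
draws = list(permutations(deck, 2))
n_A = sum(1 for first, _ in draws if is_red_ace(first))
n_AB = sum(1 for first, second in draws if is_red_ace(first) and is_red_ace(second))

P_A = Fraction(n_A, len(draws))
P_AB = Fraction(n_AB, len(draws))
P_B_given_A = P_AB / P_A  # Eq. (8.13a)

print(P_A, P_B_given_A, P_AB)  # 1/26 1/51 1/1326
```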


Independent Events: Under conditional probability, we presented an example where 
the occurrence of one event was influenced by the occurrence of another. There are, of course, 
many examples in which two or more events are entirely independent; that is, the occurrence 
of one event in no way influences the occurrence of the other event. As an example, we again 
consider the drawing of two cards in succession, but in this case we replace the card obtained 
in the first draw and shuffle the deck before the second draw. In this case, the outcome of the 
second draw is in no way influenced by the outcome of the first draw. Thus P(B), the probability 
of drawing an ace in the second draw, is independent of whether the event A (drawing an ace 
in the first trial) occurs. Thus, the events A and B are independent. The conditional probability 
P(B|A) is given by P(B). 

The event B is said to be independent of the event A if and only if 

$$P(A \cap B) = P(A)P(B)$$ (8.15a) 

Note that if the events A and B are independent, it follows from Eqs. (8.13a) and (8.15a) that 

$$P(B|A) = P(B)$$ (8.15b) 

This relationship states that if B is independent of A, then its probability is not affected by the 
event A. Naturally, if event B is independent of event A, then event A is also independent of B. 
It can be seen from Eqs. (8.14) that 

$$P(A|B) = P(A)$$ (8.15c) 

Note that there is a huge difference between independent events and mutually exclusive 
events. If A and B are mutually exclusive, then $A \cap B$ is empty and $P(A \cap B) = 0$. If A and B 
are mutually exclusive, then A and B cannot occur at the same time. When P(A) and P(B) are 
both nonzero, this clearly means that they are NOT independent events. 
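
The distinction can be checked numerically on the die experiment of Fig. 8.1. The sketch below assumes equally likely outcomes and tests the product condition of Eq. (8.15a); the function names `P` and `independent` are illustrative.

```python
from fractions import Fraction

S = frozenset(range(1, 7))      # one roll of a fair die
A_o = frozenset({1, 3, 5})      # odd number
A_e = frozenset({2, 4, 6})      # even number
B = frozenset({1, 2, 3, 4})     # number equal to or less than 4


def P(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event & S), len(S))


def independent(X, Y):
    """Check the product condition P(XY) = P(X)P(Y) of Eq. (8.15a)."""
    return P(X & Y) == P(X) * P(Y)


print(independent(A_e, B))    # True:  1/3 == (1/2)(2/3)
print(independent(A_e, A_o))  # False: 0 != (1/2)(1/2)
```

Here the events "even" and "equal to or less than 4" overlap yet are independent, while "even" and "odd" are mutually exclusive and therefore dependent.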

Bernoulli Trials 

In Bernoulli trials, if a certain event A occurs, we call it a “success.” If P(A) = p, then the 
probability of success is p. If q is the probability of failure, then q = 1 - p. We shall find the 
probability of k successes in n (Bernoulli) trials. The outcome of each trial is independent of 
the outcomes of the other trials. It is clear that in n trials, if success occurs in k trials, failure 
occurs in n - k trials. Since the outcomes of the trials are independent, the probability of this 
event is clearly $p^k(1 - p)^{n-k}$, that is, 

$$P(k \text{ successes in a specific order in } n \text{ trials}) = p^k(1 - p)^{n-k}$$

But the event of “k successes in n trials” can occur in many different ways (different orders). 
It is well known from combinatorial analysis that there are 

$$\binom{n}{k} = \frac{n!}{k!(n - k)!}$$ (8.16) 

ways in which k positions can be taken from n positions (which is the same as the number of 
ways of achieving k successes in n trials). 

This can be proved as follows. Consider an urn containing n distinguishable balls marked 
1, 2, ..., n. Suppose we draw k balls from this urn without replacing them. The first ball could 
be any one of the n balls, the second ball could be any one of the remaining (n - 1) balls, and 
so on. Hence, the total number of ways in which k balls can be drawn is 

$$n(n - 1)(n - 2) \cdots (n - k + 1) = \frac{n!}{(n - k)!}$$

Next, consider any one set of the k balls drawn. These balls can be ordered in different ways. 
We could label any one of the k balls as number 1, and any one of the remaining (k - 1) balls 
as number 2, and so on. This will give a total of $k(k - 1)(k - 2) \cdots 1 = k!$ distinguishable 
patterns formed from the k balls. The total number of ways in which k things can be taken 
from n things is n!/(n - k)!. But many of these ways will use the same k things, arranged in 
different order. The number of ways in which k things can be taken from n things without regard to order 
(unordered subset k taken from n things) is n!/(n - k)! divided by k!. This is precisely defined 
by Eq. (8.16). 

This means the probability of k successes in n trials is 

$$P(k \text{ successes in } n \text{ trials}) = \binom{n}{k} p^k(1 - p)^{n-k} = \frac{n!}{k!(n - k)!}\, p^k(1 - p)^{n-k}$$ (8.17) 

Tossing a coin and observing the number of heads is a Bernoulli trial with p = 0.5. Hence, the 
probability of observing k heads in n tosses is 

$$P(k \text{ heads in } n \text{ tosses}) = \binom{n}{k} (0.5)^k(0.5)^{n-k} = \frac{n!}{k!(n - k)!}\,(0.5)^n$$
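
Equation (8.17) translates into a one-line function. The sketch below uses `math.comb` for the binomial coefficient; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(k successes in n Bernoulli trials), Eq. (8.17)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Coin tossing (p = 0.5): exactly two heads in four tosses, as in Example 8.4.
print(binomial_pmf(2, 4, 0.5))  # 0.375, i.e. 3/8

# The probabilities for k = 0, ..., n cover every possibility and sum to 1.
print(sum(binomial_pmf(k, 10, 0.5) for k in range(11)))
```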


Example 8.6 A binary symmetric channel (BSC) has an error probability $P_e$ (i.e., the probability of receiving 
0 when 1 is transmitted, or vice versa, is $P_e$). Note that the channel behavior is symmetrical 
with respect to 0 and 1. Thus, 

$$P(0|1) = P(1|0) = P_e$$

and 

$$P(0|0) = P(1|1) = 1 - P_e$$

where P(y|x) denotes the probability of receiving y when x is transmitted. A sequence of n 
binary digits is transmitted over this channel. Determine the probability of receiving exactly 
k digits in error. 

The reception of each digit is independent of the other digits. This is an example of a 
Bernoulli trial with the probability of success $p = P_e$ (“success” here is receiving a digit 
in error). Clearly, the probability of k successes in n trials (k errors in n digits) is 

$$P(\text{receiving } k \text{ out of } n \text{ digits in error}) = \binom{n}{k} P_e^k (1 - P_e)^{n-k}$$

For example, if $P_e = 10^{-5}$, the probability of receiving two digits wrong in a sequence 
of eight digits is 

$$\binom{8}{2} (10^{-5})^2 (1 - 10^{-5})^6 \simeq \frac{8!}{2!\,6!}\, 10^{-10} = (2.8)\,10^{-9}$$


Example 8.7 PCM Repeater Error Probability 

In pulse code modulation, regenerative repeaters are used to detect pulses (before they are lost 
in noise) and retransmit new, clean pulses. This combats the accumulation of noise and pulse 
distortion. 

A certain PCM channel consists of n identical links in tandem (Fig. 8.3). The pulses are 
detected at the end of each link and clean new pulses are transmitted over the next link. If $P_e$ 
is the probability of error in detecting a pulse over any one link, show that $P_E$, the probability 
of error in detecting a pulse over the entire channel (over the n links in tandem), is 

$$P_E \simeq nP_e \qquad nP_e \ll 1$$

Figure 8.3 A PCM repeater. 

The probabilities of detecting a pulse correctly over one link and over the entire channel 
(n links in tandem) are $1 - P_e$ and $1 - P_E$, respectively. A pulse can be detected correctly 
over the entire channel if either the pulse is detected correctly over every link or errors 
are made over an even number of links only. 

$$1 - P_E = P(\text{correct detection over all links}) + P(\text{error over two links only}) + P(\text{error over four links only}) + \cdots + P\left(\text{error over } 2\left\lfloor \tfrac{n}{2} \right\rfloor \text{ links only}\right)$$

where $\lfloor a \rfloor$ denotes the largest integer less than or equal to a. 

Because pulse detection over each link is independent of the other links (see Example 8.6), 

$$P(\text{correct detection over all } n \text{ links}) = (1 - P_e)^n$$

and 

$$P(\text{error over } k \text{ links only}) = \frac{n!}{k!(n - k)!}\, P_e^k (1 - P_e)^{n-k}$$

Hence, 

$$1 - P_E = (1 - P_e)^n + \sum_{k = 2, 4, 6, \ldots} \frac{n!}{k!(n - k)!}\, P_e^k (1 - P_e)^{n-k}$$

In practice, $P_e \ll 1$, so only the first two terms on the right-hand side of this equation are 
of significance. Also, $(1 - P_e)^{n-k} \simeq 1$, and 

$$1 - P_E \simeq (1 - P_e)^n + \frac{n!}{2!(n - 2)!}\, P_e^2 = (1 - P_e)^n + \frac{n(n - 1)}{2}\, P_e^2$$

If $nP_e \ll 1$, then the second term can also be neglected, and 

$$1 - P_E \simeq (1 - P_e)^n \simeq 1 - nP_e \qquad nP_e \ll 1$$

and 

$$P_E \simeq nP_e$$

We can explain this result heuristically by considering the transmission of N ($N \to \infty$) 
pulses. Each link makes $NP_e$ errors, and the total number of errors is approximately $nNP_e$ 
(approximately, because some of the erroneous pulses over a link will be erroneous over 
other links). Thus the overall error probability is $nP_e$. 
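
The quality of the approximation can be checked numerically. The sketch below sums the probabilities of errors over an odd number of links, which is the complement of the even-k sum in the example (for a binary pulse, an odd number of inversions leaves the pulse in error), and compares the result with n times the per-link error probability. The function name and the sample values of n and the per-link error probability are assumptions chosen for illustration.

```python
from math import comb


def overall_error_probability(n, p_e):
    """Exact P_E for n links in tandem: probability of errors over an odd number of links."""
    return sum(
        comb(n, k) * p_e**k * (1 - p_e) ** (n - k)
        for k in range(1, n + 1, 2)  # k = 1, 3, 5, ...
    )


n, p_e = 10, 1e-5  # illustrative values with n * p_e much smaller than 1
exact = overall_error_probability(n, p_e)
print(exact, n * p_e)
```

With these values the exact result and the approximation differ only in the fourth significant digit or beyond; the gap widens as the product of n and the per-link error probability approaches 1.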


Example 8.8 In binary communication, one of the techniques used to increase the reliability of a channel 
is to repeat a message several times. For example, we can send each message (0 or 1) 
three times. Hence, the transmitted digits are 000 (for message 0) or 111 (for message 
1). Because of channel noise, we may receive any one of the eight possible combinations 
of three binary digits. The decision as to which message is transmitted is made by the 
majority rule; that is, if at least two of the three detected digits are 0, the decision is 0, and 
so on. This scheme permits correct reception of data even if one out of three digits is in 
error. Detection error occurs only if at least two out of three digits are received in error. 
If $P_e$ is the error probability of one digit, and $P(\epsilon)$ is the probability of making a wrong 
decision in this scheme, then 

$$P(\epsilon) = \sum_{k=2}^{3} \binom{3}{k} P_e^k (1 - P_e)^{3-k} = 3P_e^2(1 - P_e) + P_e^3$$

In practice, $P_e \ll 1$, and 

$$P(\epsilon) \simeq 3P_e^2$$

For instance, if $P_e = 10^{-4}$, $P(\epsilon) \simeq 3 \times 10^{-8}$. Thus, the error probability is reduced 
from $10^{-4}$ to $3 \times 10^{-8}$. We can use any odd number of repetitions for this scheme to 
function. 

In this example, higher reliability is achieved at the cost of a reduction in the rate of 
information transmission by a factor of 3. We shall see in Chapter 14 that more efficient 
ways exist to effect a trade-off between reliability and the rate of transmission through the 
use of error correction codes. 
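
The majority-rule error probability generalizes to any odd number of repetitions by summing the binomial terms for "more than half the digits in error." This is a sketch assuming independent digit errors, as in the example; the function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(p_e, repeats=3):
    """P(wrong majority decision) when each of `repeats` digits errs independently with p_e."""
    if repeats % 2 == 0:
        raise ValueError("use an odd number of repetitions so that a majority always exists")
    need = repeats // 2 + 1  # smallest number of digit errors that flips the decision
    return sum(
        comb(repeats, k) * p_e**k * (1 - p_e) ** (repeats - k)
        for k in range(need, repeats + 1)
    )


print(majority_error(1e-4))     # close to 3e-8, as in the example
print(majority_error(1e-4, 5))  # five repetitions lower the error probability further
```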


Multiplication Rule for Conditional Probabilities 

As shown in Eq. (8.12), we can write the probability of the joint event as 

$$P(A \cap B) = P(A)P(B|A)$$

This rule on joint events can be generalized for multiple events $A_1, A_2, \ldots, A_n$ via iterations. 
If $A_1 A_2 \cdots A_n \ne \emptyset$, then we have 

$$P(A_1 A_2 \cdots A_n) = \frac{P(A_1 A_2 \cdots A_n)}{P(A_1 A_2 \cdots A_{n-1})} \cdot \frac{P(A_1 A_2 \cdots A_{n-1})}{P(A_1 A_2 \cdots A_{n-2})} \cdots \frac{P(A_1 A_2)}{P(A_1)} \cdot P(A_1)$$ (8.18a) 

$$= P(A_n | A_1 A_2 \cdots A_{n-1}) \cdot P(A_{n-1} | A_1 A_2 \cdots A_{n-2}) \cdots P(A_2 | A_1) \cdot P(A_1)$$ (8.18b) 

Note that since $A_1 A_2 \cdots A_n \ne \emptyset$, every denominator in Eq. (8.18a) is positive and well defined. 


Example 8.9 Suppose a box of diodes consists of $N_g$ good diodes and $N_b$ bad diodes. If five diodes are 
randomly selected, one at a time, without replacement, determine the probability of obtaining 
the sequence of diodes in the order of good, bad, good, good, bad. 

We can denote $G_i$ as the event that the ith draw is a good diode. We are interested in the 
event $G_1 G_2^c G_3 G_4 G_5^c$. 

$$P(G_1 G_2^c G_3 G_4 G_5^c) = P(G_1)\,P(G_2^c | G_1)\,P(G_3 | G_1 G_2^c)\,P(G_4 | G_1 G_2^c G_3)\,P(G_5^c | G_1 G_2^c G_3 G_4)$$

$$= \frac{N_g}{N_g + N_b} \cdot \frac{N_b}{N_g + N_b - 1} \cdot \frac{N_g - 1}{N_g + N_b - 2} \cdot \frac{N_g - 2}{N_g + N_b - 3} \cdot \frac{N_b - 1}{N_g + N_b - 4}$$
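
The chain of conditional probabilities in Example 8.9 can be evaluated step by step, updating the counts of remaining good and bad diodes after each draw. The sketch below is one way to do this with exact fractions; the function name, the `'G'`/`'B'` pattern encoding, and the box contents (6 good, 4 bad) are assumptions for illustration only.

```python
from fractions import Fraction


def sequence_probability(pattern, n_good, n_bad):
    """Probability of drawing the given 'G'/'B' pattern in order, without replacement.

    Applies the multiplication rule of Eq. (8.18b) one draw at a time.
    """
    if len(pattern) > n_good + n_bad:
        raise ValueError("cannot draw more diodes than the box contains")
    prob = Fraction(1)
    good, bad = n_good, n_bad
    for item in pattern:
        total = good + bad
        if item == "G":
            prob *= Fraction(good, total)
            good -= 1
        elif item == "B":
            prob *= Fraction(bad, total)
            bad -= 1
        else:
            raise ValueError("pattern may contain only 'G' and 'B'")
    return prob


# Hypothetical box with 6 good and 4 bad diodes; order good, bad, good, good, bad.
print(sequence_probability("GBGGB", 6, 4))  # 1/21
```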


To Divide and Conquer: The Total Probability Theorem 

In analyzing a particular event of interest, sometimes a direct approach to evaluating its prob¬ 
ability can be difficult because there can be so many different outcomes to enumerate. When 
dealing with such problems, it is often advantageous to adopt the divide-and-conquer approach 
by separating all the possible causes leading to the particular event of interest B * The total 
probability theorem provides a perfect tool for analyzing the probability of such problems. 

We define S as the sample space of the experiment of interest. As shown in Fig* 8*4, the 
entire sample space can be partitioned into rc disjoint events A[, * * *, A n . We can now state the 
theorem: 
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Total Probability Theorem: Let n disjoint events A i, . * *, A n form a partition of the 
sample space 5 such that 


H 

|^Ja ; — S and A, Pi Ay = 0, if £ ^ j 

i= I 


Then the probability of an event B can be written as 


P(B) = '£ / P(B\A i )P(A i ) 

i=\ 

Proof: The proof of this theorem is quite simple based on Fig. 8.4. Since $\{A_i\}$ form a partition
of $S$, then

$$B = B \cap S = B \cap (A_1 \cup A_2 \cup \cdots \cup A_n) = (A_1 B) \cup (A_2 B) \cup \cdots \cup (A_n B)$$

Because $\{A_i\}$ are disjoint, so are $\{A_i B\}$. Thus,

$$P(B) = \sum_{i=1}^{n} P(A_i B) = \sum_{i=1}^{n} P(B|A_i)P(A_i)$$

This theorem can simplify the analysis of the more complex event of interest $B$ by identifying
all different causes $A_i$ for $B$. By quantifying the effect of $A_i$ on $B$ through $P(B|A_i)$, the
theorem allows us to “divide-and-conquer” a complex problem (of event $B$).
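A minimal numerical sketch of the theorem follows. The helper name `total_probability` and the three-event partition with its probabilities are hypothetical, chosen only to show the sum $\sum_i P(B|A_i)P(A_i)$ being formed.

```python
def total_probability(priors, conditionals):
    """Return P(B) = sum_i P(B|A_i) P(A_i) for a partition {A_i}.

    priors[i] is P(A_i); conditionals[i] is P(B|A_i).
    """
    if len(priors) != len(conditionals):
        raise ValueError("priors and conditionals must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for a partition")
    return sum(c * p for c, p in zip(conditionals, priors))


# Hypothetical partition A_1, A_2, A_3 and conditional probabilities of B.
priors = [0.90, 0.07, 0.03]
conditionals = [0.001, 0.05, 0.5]
p_b = total_probability(priors, conditionals)
```

The check that the priors sum to 1 mirrors the requirement that the events $A_i$ cover the whole sample space.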


Example 8.10 The decoding of a data packet may be in error because of $N$ distinct error patterns
$E_1, E_2, \ldots, E_N$ it encounters. These error patterns are mutually exclusive, each with
probability $P(E_i) = p_i$. When the error pattern $E_i$ occurs, the data packet would be incorrectly
decoded with probability $q_i$. Find the probability that the data packet is incorrectly decoded.

We apply the total probability theorem to tackle this problem. First, define $B$ as the event that
the data packet is incorrectly decoded. Based on the problem, we know that

$$P(B|E_i) = q_i \quad \text{and} \quad P(E_i) = p_i$$

Furthermore, the data packet always encounters one of these error patterns. Therefore

$$\sum_{i=1}^{N} p_i = 1$$

Applying the total probability theorem, we find that

$$P(B) = \sum_{i=1}^{N} P(B|E_i)P(E_i) = \sum_{i=1}^{N} q_i p_i$$


Isolating a Particular Cause: Bayes’ Theorem 

The total probability theorem facilitates the probabilistic analysis of a complex event by using a
divide-and-conquer approach. In practice, it may also be of interest to determine the likelihood
of a particular cause of an event among many disjoint possible causes. Bayes' theorem provides
the solution to this problem.

Bayes' Theorem: Let $n$ disjoint events $A_1, \ldots, A_n$ form a partition of the sample space
$S$. Let $B$ be an event with $P(B) > 0$. Then for $j = 1, \ldots, n$,

$$P(A_j|B) = \frac{P(B|A_j)P(A_j)}{P(B)} = \frac{P(B|A_j)P(A_j)}{\sum_{i=1}^{n} P(B|A_i)P(A_i)}$$

The proof is already given by the theorem itself: the first equality follows from the definition
of conditional probability, and the second from the total probability theorem.

Bayes' theorem provides a simple method for computing the conditional probability of $A_j$
given that $B$ has occurred. The probability $P(A_j|B)$ is often known as the posterior probability
of event $A_j$. It describes, among $n$ possible causes of $B$, the probability that $B$ may be caused
by $A_j$. In other words, Bayes' theorem isolates and finds the relative likelihood of each possible
cause to an event of interest.


Example 8.11 A communication system always encounters one of three possible interference waveforms:
$F_1$, $F_2$, or $F_3$. The probability of each interference is 0.8, 0.16, and 0.04, respectively. The
communication system fails with probabilities 0.01, 0.1, and 0.4 when it encounters $F_1$, $F_2$,
and $F_3$, respectively. Given that the system has failed, find the probability that the failure is a
result of $F_1$, $F_2$, or $F_3$, respectively.

Denote $B$ as the event of system failure. We know from the description that

$$P(F_1) = 0.8 \qquad P(F_2) = 0.16 \qquad P(F_3) = 0.04$$

Furthermore, the effect of each interference on the system is given by

$$P(B|F_1) = 0.01 \qquad P(B|F_2) = 0.1 \qquad P(B|F_3) = 0.4$$

Now following Bayes' theorem, we find that

$$P(F_1|B) = \frac{P(B|F_1)P(F_1)}{\sum_{i=1}^{3} P(B|F_i)P(F_i)} = \frac{(0.01)(0.8)}{(0.01)(0.8) + (0.1)(0.16) + (0.4)(0.04)} = 0.2$$

$$P(F_2|B) = \frac{P(B|F_2)P(F_2)}{\sum_{i=1}^{3} P(B|F_i)P(F_i)} = 0.4$$

$$P(F_3|B) = \frac{P(B|F_3)P(F_3)}{\sum_{i=1}^{3} P(B|F_i)P(F_i)} = 0.4$$

Example 8.11 illustrates the major difference between the posterior probability $P(F_i|B)$
and the prior probability $P(F_i)$. Although the prior probability $P(F_3) = 0.04$ is the lowest
among the three possible interferences, once the failure event has occurred, $F_3$ is actually one
of the most likely causes, with $P(F_3|B) = 0.4$. Bayes' theorem is an important tool in communications
for determining the relative likelihood of a particular cause to an event.
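The posterior probabilities of Example 8.11 can be reproduced with a few lines of code. This is a sketch only; the function name `posteriors` is a hypothetical helper, while the numbers are the priors and failure probabilities stated in the example.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem: return the list of P(A_j|B).

    priors[j] is P(A_j); likelihoods[j] is P(B|A_j).
    """
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(B|A_j) P(A_j)
    evidence = sum(joint)  # P(B), by the total probability theorem
    return [j / evidence for j in joint]


# Values from Example 8.11: interference F_1, F_2, F_3.
priors = [0.8, 0.16, 0.04]
likelihoods = [0.01, 0.1, 0.4]
post = posteriors(priors, likelihoods)
```

Because the denominator is shared by all causes, only the products $P(B|F_i)P(F_i)$ determine the ranking of the causes.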

Axiomatic Theory of Probability 

The relative frequency definition of probability is intuitively appealing. Unfortunately, it raises
some serious mathematical objections. Logically there is no reason why we should get the same
estimate of the relative frequency whether we base it on 10,000 trials or on 20. Moreover, in
the relative frequency definition, it is not clear when and in what mathematical sense the limit
in Eq. (8.5) exists. If we consider a set of an infinite number of trials, we can partition such
a set into several subsets, such as odd and even numbered trials. Each of these subsets (of
infinite trials each) would have its own relative frequency. So far, all the attempts to prove that
the relative frequencies of all the subsets are equal have been futile [1]. There are some other
difficulties also. For instance, some events, such as Julius Caesar having visited Great Britain,
belong to an experiment that cannot be repeated over an infinite number of trials. Thus, we
can never know the probability of such an event. We, therefore, need to develop a theory of
probability that is not tied down to any particular definition of probability. In other words, we
must separate the empirical and the formal problems of probability. Assigning probabilities to
events is an empirical aspect, and setting up a purely formal calculus to deal with probabilities
(assigned by whatever empirical method) is the formal aspect.

It is instructive to consider here the basic difference between physical sciences and
mathematics. Physical sciences are based on inductive logic, while mathematics is strictly
a deductive logic. Inductive logic consists of making a large number of observations and then
generalizing, from these observations, laws that will explain these observations. For instance,
history and experience tell us that every human being must die someday. This leads to a law
that humans are mortals. This is inductive logic. Based on a law (or laws) obtained by inductive
logic, we can make further deductions. The statement “John is a human being, so he must
die some day” is an example of deductive logic. Deriving the laws of the physical sciences
is basically an exercise in inductive logic, whereas mathematics is pure deductive logic. In a
physical science we make observations in a certain field and generalize these observations into
laws such as Ohm’s law, Maxwell’s equations, and quantum mechanics. There are no other
proofs for these inductively obtained laws; they are found to be true by observation. But once
we have such inductively formulated laws (axioms or hypotheses), by using thought process,
we can deduce additional results based on these basic laws or axioms alone. This is the proper
domain of mathematics. All these deduced results have to be proved rigorously based on a set of
axioms. Thus, based on Maxwell’s equations alone, we can derive the laws of the propagation
of electromagnetic waves.

This discussion shows that the discipline of mathematics can be summed up in one aphorism:
“This implies that.” In other words, if we are given a certain set of axioms (hypotheses),
then, based upon these axioms alone, what else is true? As Bertrand Russell puts it: “Pure
mathematics consists entirely of such asseverations as that, if such and such proposition is true of
anything, then such and such another proposition is true of that thing.” Seen in this light, it may
appear that assigning probability to an event may not necessarily be the responsibility of the
mathematical discipline of probability. Under mathematical discipline, we need to start with
a set of axioms about probability and then investigate what else can be said about probability
based on this set of axioms alone. We start with a concept (as yet undefined) of probability
and postulate axioms. The axioms must be internally consistent and should conform to the
observed relationships and behavior of probability in the practical and the intuitive sense. It
is beyond the scope of this book to discuss how these axioms are formulated. The modern
theory of probability starts with Eqs. (8.6), (8.8), and (8.11) as its axioms. Based on these three
axioms alone, what else is true is the essence of modern theory of probability. The relative
frequency approach uses Eq. (8.5) to define probability, and Eqs. (8.6), (8.8), and (8.11) follow
as a consequence of this definition. In the axiomatic approach, on the other hand, we do not
say anything about how we assign probability $P(A)$ to an event $A$; rather, we postulate that the
probability function must obey the three postulates or axioms in Eqs. (8.6), (8.8), and (8.11).
The modern theory of probability does not concern itself with the problem of assigning
probabilities to events. It assumes that somehow the probabilities were assigned to these events a
priori.

If a mathematical model is to conform to the real phenomenon, we must assign these
probabilities in a way that is consistent with an empirical and an intuitive understanding of
probability. The concept of relative frequency is admirably suited for this. Thus, although we
use relative frequency to assign (not define) probabilities, it is all under the table, not a part of
the mathematical discipline of probability.


8.2 RANDOM VARIABLES 

The outcome of an experiment may be a real number (as in the case of rolling a die), or it
may be nonnumerical and describable by a phrase (such as “heads” or “tails” in tossing a coin).
From a mathematical point of view, it is simpler to have numerical values for all outcomes.
For this reason, we assign a real number to each sample point according to some rule. If there
are $m$ sample points $\zeta_1, \zeta_2, \ldots, \zeta_m$, then using some convenient rule, we assign a real number
$\mathrm{x}(\zeta_i)$ to sample point $\zeta_i$ ($i = 1, 2, \ldots, m$). In the case of tossing a coin, for example, we may
assign the number 1 for the outcome heads and the number $-1$ for the outcome tails (Fig. 8.5).

Thus, $\mathrm{x}(\cdot)$ is a function that maps sample points $\zeta_1, \zeta_2, \ldots, \zeta_m$ into real numbers
$x_1, x_2, \ldots, x_n$.* We now have a random variable $\mathrm{x}$ that takes on values $x_1, x_2, \ldots, x_n$.
We shall use roman type ($\mathrm{x}$) to denote a random variable (RV) and italic type (e.g.,
$x_1, x_2, \ldots, x_n$) to denote the value it takes. The probability of an RV $\mathrm{x}$ taking a value $x_i$
is $P_{\mathrm{x}}(x_i)$ = Probability of “$\mathrm{x} = x_i$.”


Discrete Random Variables 

A random variable is discrete if there exists a denumerable sequence of distinct numbers $x_i$
such that

$$\sum_{i} P_{\mathrm{x}}(x_i) = 1 \qquad (8.19)$$

* The number $m$ is not necessarily equal to $n$. More than one sample point can map into one value of $\mathrm{x}$.

Figure 8.5 Probabilities in a coin-tossing experiment.

Thus, a discrete RV can assume only certain discrete values. An RV that can assume any value
over a continuous set is called a continuous random variable.


Example 8.12 Two dice are thrown. The sum of the points appearing on the two dice is an RV $\mathrm{x}$. Find the
values taken by $\mathrm{x}$, and the corresponding probabilities.

We see that $\mathrm{x}$ can take on all integral values from 2 through 12. Various probabilities can
be determined by the method outlined in Example 8.3.

There are 36 sample points in all, each with probability 1/36. Dice outcomes for
various values of $\mathrm{x}$ are shown in Table 8.1. Note that although there are 36 sample points,
they all map into 11 values of $\mathrm{x}$. This is because more than one sample point maps into
the same value of $\mathrm{x}$. For example, six sample points map into $\mathrm{x} = 7$.

The reader can verify that $\sum_{i=2}^{12} P_{\mathrm{x}}(x_i) = 1$.

TABLE 8.1

Value of x_i   Dice Outcomes                                 P_x(x_i)
 2             (1,1)                                         1/36
 3             (1,2), (2,1)                                  2/36 = 1/18
 4             (1,3), (2,2), (3,1)                           3/36 = 1/12
 5             (1,4), (2,3), (3,2), (4,1)                    4/36 = 1/9
 6             (1,5), (2,4), (3,3), (4,2), (5,1)             5/36
 7             (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)      6/36 = 1/6
 8             (2,6), (3,5), (4,4), (5,3), (6,2)             5/36
 9             (3,6), (4,5), (5,4), (6,3)                    4/36 = 1/9
10             (4,6), (5,5), (6,4)                           3/36 = 1/12
11             (5,6), (6,5)                                  2/36 = 1/18
12             (6,6)                                         1/36

The preceding discussion can be extended to two RVs, $\mathrm{x}$ and $\mathrm{y}$. The joint probability
$P_{\mathrm{xy}}(x_i, y_j)$ is the probability that “$\mathrm{x} = x_i$ and $\mathrm{y} = y_j$.” Consider, for example, the case of a
coin tossed twice in succession. If the outcomes of the first and second tosses are mapped into
RVs $\mathrm{x}$ and $\mathrm{y}$, then $\mathrm{x}$ and $\mathrm{y}$ each takes values 1 and $-1$. Because the outcomes of the two tosses
are independent, $\mathrm{x}$ and $\mathrm{y}$ are independent, and

$$P_{\mathrm{xy}}(x_i, y_j) = P_{\mathrm{x}}(x_i)\,P_{\mathrm{y}}(y_j)$$

and

$$P_{\mathrm{xy}}(1, 1) = P_{\mathrm{xy}}(-1, 1) = P_{\mathrm{xy}}(1, -1) = P_{\mathrm{xy}}(-1, -1) = \frac{1}{4}$$

These probabilities are plotted in Fig. 8.6.

For a general case where the variable $\mathrm{x}$ can take values $x_1, x_2, \ldots, x_n$ and the variable $\mathrm{y}$
can take values $y_1, y_2, \ldots, y_m$, we have

$$\sum_{i}\sum_{j} P_{\mathrm{xy}}(x_i, y_j) = 1 \qquad (8.20)$$

This follows from the fact that the summation on the left is the probability of the union of all
possible outcomes and must be unity (a certain event).

Conditional Probabilities 

If $\mathrm{x}$ and $\mathrm{y}$ are two RVs, then the conditional probability of $\mathrm{x} = x_i$ given $\mathrm{y} = y_j$ is denoted by
$P_{\mathrm{x|y}}(x_i|y_j)$. Moreover,

$$\sum_{i} P_{\mathrm{x|y}}(x_i|y_j) = \sum_{j} P_{\mathrm{y|x}}(y_j|x_i) = 1 \qquad (8.21)$$

This can be proved by observing that probabilities $P_{\mathrm{x|y}}(x_i|y_j)$ are specified over the sample
space corresponding to the condition $\mathrm{y} = y_j$. Hence, $\sum_i P_{\mathrm{x|y}}(x_i|y_j)$ is the probability of the
union of all possible outcomes of $\mathrm{x}$ (under the condition $\mathrm{y} = y_j$) and must be unity (a certain
event). A similar argument applies to $\sum_j P_{\mathrm{y|x}}(y_j|x_i)$. Also from Eq. (8.12), we have

$$P_{\mathrm{xy}}(x_i, y_j) = P_{\mathrm{x|y}}(x_i|y_j)\,P_{\mathrm{y}}(y_j) = P_{\mathrm{y|x}}(y_j|x_i)\,P_{\mathrm{x}}(x_i) \qquad (8.22)$$

Bayes' rule follows from Eq. (8.22). Also from Eq. (8.22), we have

$$
\begin{aligned}
\sum_{i} P_{\mathrm{xy}}(x_i, y_j) &= \sum_{i} P_{\mathrm{x|y}}(x_i|y_j)\,P_{\mathrm{y}}(y_j) \\
&= P_{\mathrm{y}}(y_j) \sum_{i} P_{\mathrm{x|y}}(x_i|y_j) \\
&= P_{\mathrm{y}}(y_j)
\end{aligned}
\qquad (8.23a)
$$

Similarly,

$$P_{\mathrm{x}}(x_i) = \sum_{j} P_{\mathrm{xy}}(x_i, y_j) \qquad (8.23b)$$

The probabilities $P_{\mathrm{x}}(x_i)$ and $P_{\mathrm{y}}(y_j)$ are called marginal probabilities. Equations (8.23) show
how to determine marginal probabilities from joint probabilities. Results of Eqs. (8.20) through
(8.23) can be extended to more than two RVs.
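Equations (8.21) through (8.23) can be exercised on a small joint probability table. The following sketch assumes a hypothetical joint distribution of two binary RVs; the names `marginals` and `conditional_x_given_y` are illustrative helpers, not notation from the text.

```python
def marginals(joint):
    """Return (P_x, P_y) from joint[(x, y)] by summing as in Eqs. (8.23)."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # sum over j, Eq. (8.23b)
        py[y] = py.get(y, 0.0) + p  # sum over i, Eq. (8.23a)
    return px, py


def conditional_x_given_y(joint, y):
    """Return P_{x|y}(x|y) = P_xy(x, y) / P_y(y), rearranged from Eq. (8.22)."""
    _, py = marginals(joint)
    return {x: p / py[y] for (x, yy), p in joint.items() if yy == y}


# Hypothetical joint probabilities P_xy(x, y) for x, y in {0, 1}.
joint = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3}
px, py = marginals(joint)
cond = conditional_x_given_y(joint, 1)
```

Summing the values of `cond` gives 1, which is the statement of Eq. (8.21) for the condition y = 1.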


Example 8.13 The error probability of a binary symmetric channel (BSC) is $P_e$. The probability of
transmitting 1 is $Q$, and that of transmitting 0 is $1 - Q$ (Fig. 8.7). Determine the probabilities of
receiving 1 and 0 at the receiver.

Figure 8.7 Binary symmetric channel (BSC).

If $\mathrm{x}$ and $\mathrm{y}$ are the transmitted digit and the received digit, respectively, then a 1 is received
either when 1 is transmitted correctly or when 0 is transmitted in error. By the total probability
theorem,

$$P_{\mathrm{y}}(1) = P_{\mathrm{x}}(1)(1 - P_e) + P_{\mathrm{x}}(0)P_e = Q(1 - P_e) + (1 - Q)P_e$$

Similarly,

$$P_{\mathrm{y}}(0) = (1 - Q)(1 - P_e) + QP_e$$

These answers seem almost obvious from Fig. 8.7.

Note that because of channel errors, the probability of receiving a digit 1 is not the
same as that of transmitting 1. The same is true of 0.

Example 8.14 Over a certain binary communication channel, the symbol 0 is transmitted with probability 0.4
and 1 is transmitted with probability 0.6. It is given that $P(\epsilon|0) = 10^{-6}$ and $P(\epsilon|1) = 10^{-4}$,
where $P(\epsilon|x_i)$ is the probability of detecting the error given that $x_i$ is transmitted. Determine
$P(\epsilon)$, the error probability of the channel.

If $P(\epsilon, x_i)$ is the joint probability that $x_i$ is transmitted and it is detected wrongly, then the
total probability theorem yields

$$
\begin{aligned}
P(\epsilon) &= \sum_{i} P(\epsilon|x_i)P(x_i) \\
&= P_{\mathrm{x}}(0)P(\epsilon|0) + P_{\mathrm{x}}(1)P(\epsilon|1) \\
&= 0.4(10^{-6}) + 0.6(10^{-4}) \\
&= 0.604(10^{-4})
\end{aligned}
$$

Note that $P(\epsilon|0) = 10^{-6}$ means that on the average, one out of 1 million transmitted
0s will be detected erroneously. Similarly, $P(\epsilon|1) = 10^{-4}$ means that on the average, one
out of 10,000 transmitted 1s will be in error. But $P(\epsilon) = 0.604(10^{-4})$ indicates that on the
average, one out of $1/[0.604(10^{-4})] \approx 16{,}556$ digits (regardless of whether they are 1s
or 0s) will be received in error.
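Examples 8.13 and 8.14 both apply the total probability theorem to a binary channel, and the arithmetic is easy to script. In the sketch below the function names are hypothetical, and the values Q = 0.6 and P_e = 0.1 used for the BSC are assumed for illustration only; the Example 8.14 inputs are the ones given in the text.

```python
def received_probabilities(q, pe):
    """Return (P_y(1), P_y(0)) for a BSC with P_x(1) = q and error probability pe."""
    p_y1 = q * (1 - pe) + (1 - q) * pe
    p_y0 = (1 - q) * (1 - pe) + q * pe
    return p_y1, p_y0


def channel_error_probability(p_x0, p_x1, pe_given_0, pe_given_1):
    """Return P(e) = P_x(0) P(e|0) + P_x(1) P(e|1)."""
    return p_x0 * pe_given_0 + p_x1 * pe_given_1


# Example 8.13 with assumed values Q = 0.6 and P_e = 0.1.
p_y1, p_y0 = received_probabilities(0.6, 0.1)

# Example 8.14 with the values stated in the text.
p_err = channel_error_probability(0.4, 0.6, 1e-6, 1e-4)
digits_per_error = 1.0 / p_err
```

The two received probabilities always sum to 1, which is a convenient sanity check on the BSC formulas.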


Cumulative Distribution Function 

The cumulative distribution function (CDF) $F_{\mathrm{x}}(x)$ of an RV $\mathrm{x}$ is the probability that $\mathrm{x}$ takes
a value less than or equal to $x$; that is,

$$F_{\mathrm{x}}(x) = P(\mathrm{x} \le x) \qquad (8.24)$$

We can show that a CDF $F_{\mathrm{x}}(x)$ has the following four properties:

1. $F_{\mathrm{x}}(x) \ge 0$   (8.25a)

2. $F_{\mathrm{x}}(\infty) = 1$   (8.25b)

3. $F_{\mathrm{x}}(-\infty) = 0$   (8.25c)

4. $F_{\mathrm{x}}(x)$ is a nondecreasing function, that is,   (8.25d)

   $F_{\mathrm{x}}(x_1) \le F_{\mathrm{x}}(x_2)$ for $x_1 \le x_2$   (8.25e)

The first property is obvious. The second and third properties are proved by observing that
$F_{\mathrm{x}}(\infty) = P(\mathrm{x} \le \infty)$ and $F_{\mathrm{x}}(-\infty) = P(\mathrm{x} \le -\infty)$. To prove the fourth property, we have,
from Eq. (8.24),

$$
\begin{aligned}
F_{\mathrm{x}}(x_2) &= P(\mathrm{x} \le x_2) \\
&= P[(\mathrm{x} \le x_1) \cup (x_1 < \mathrm{x} \le x_2)]
\end{aligned}
$$

Because $\mathrm{x} \le x_1$ and $x_1 < \mathrm{x} \le x_2$ are disjoint, we have

$$
\begin{aligned}
F_{\mathrm{x}}(x_2) &= P(\mathrm{x} \le x_1) + P(x_1 < \mathrm{x} \le x_2) \\
&= F_{\mathrm{x}}(x_1) + P(x_1 < \mathrm{x} \le x_2)
\end{aligned}
\qquad (8.26)
$$

Because $P(x_1 < \mathrm{x} \le x_2)$ is nonnegative, the result follows.


Example 8.15 In an experiment, a trial consists of four successive tosses of a coin. If we define an RV $\mathrm{x}$ as
the number of heads appearing in a trial, determine $P_{\mathrm{x}}(x)$ and $F_{\mathrm{x}}(x)$.

A total of 16 distinct equiprobable outcomes are listed in Example 8.4. Various probabilities
can be readily determined by counting the outcomes pertaining to a given value of $\mathrm{x}$. For
example, only one outcome maps into $\mathrm{x} = 0$, whereas six outcomes map into $\mathrm{x} = 2$. Hence,
$P_{\mathrm{x}}(0) = 1/16$ and $P_{\mathrm{x}}(2) = 6/16$. In the same way, we find

$$P_{\mathrm{x}}(0) = P_{\mathrm{x}}(4) = 1/16$$
$$P_{\mathrm{x}}(1) = P_{\mathrm{x}}(3) = 4/16 = 1/4$$
$$P_{\mathrm{x}}(2) = 6/16 = 3/8$$

The probabilities $P_{\mathrm{x}}(x_i)$ and the corresponding CDF $F_{\mathrm{x}}(x_i)$ are shown in Fig. 8.8.

Figure 8.8 (a) Probabilities $P_{\mathrm{x}}(x_i)$ and (b) the cumulative distribution function (CDF).


Continuous Random Variables 

A continuous RV $\mathrm{x}$ can assume any value in a certain interval. In a continuum of any range, an
uncountably infinite number of possible values exist, and $P_{\mathrm{x}}(x_i)$, the probability that $\mathrm{x} = x_i$,
as one of the uncountably infinite values, is generally zero. Consider the case of a temperature
$\mathrm{T}$ at a certain location. We may suppose that this temperature can assume any of a range of
values. Thus, an infinite number of possible temperature values may prevail, and the probability
that the random variable $\mathrm{T}$ will assume a certain value $T_i$ is zero. The situation is somewhat
similar to that described in Sec. 3.1 in connection with a continuously loaded beam (Fig. 3.5b).
There is a loading along the beam at every point, but at any one point the load is zero. The
meaningful measure in that case was the loading (or weight) not at a point, but over a finite
interval. Similarly, for a continuous RV, the meaningful quantity is not the probability that
$\mathrm{x} = x_i$ but the probability that $x < \mathrm{x} \le x + \Delta x$. For such a measure, the CDF is eminently
suited because the latter probability is simply $F_{\mathrm{x}}(x + \Delta x) - F_{\mathrm{x}}(x)$ [see Eq. (8.26)]. Hence, we
begin our study of continuous RVs with the CDF.

Properties of the CDF [Eqs. (8.25) and (8.26)] derived earlier are general and are valid
for continuous as well as discrete RVs.

Probability Density Function: From Eq. (8.26), we have

$$F_{\mathrm{x}}(x + \Delta x) = F_{\mathrm{x}}(x) + P(x < \mathrm{x} \le x + \Delta x) \qquad (8.27a)$$

If $\Delta x \to 0$, then we can also express $F_{\mathrm{x}}(x + \Delta x)$ via Taylor expansion as

$$F_{\mathrm{x}}(x + \Delta x) \simeq F_{\mathrm{x}}(x) + \frac{dF_{\mathrm{x}}(x)}{dx}\,\Delta x \qquad (8.27b)$$

From Eqs. (8.27), it follows that as $\Delta x \to 0$,

$$\frac{dF_{\mathrm{x}}(x)}{dx}\,\Delta x = P(x < \mathrm{x} \le x + \Delta x) \qquad (8.28)$$

We designate the derivative of $F_{\mathrm{x}}(x)$ with respect to $x$ by $p_{\mathrm{x}}(x)$ (Fig. 8.9),

$$\frac{dF_{\mathrm{x}}(x)}{dx} = p_{\mathrm{x}}(x) \qquad (8.29)$$

The function $p_{\mathrm{x}}(x)$ is called the probability density function (PDF) of the RV $\mathrm{x}$. It follows
from Eq. (8.28) that the probability of observing the RV $\mathrm{x}$ in the interval $(x, x + \Delta x)$ is
$p_{\mathrm{x}}(x)\Delta x$ ($\Delta x \to 0$). This is the area under the PDF $p_{\mathrm{x}}(x)$ over the interval $\Delta x$, as shown in
Fig. 8.9b.

Figure 8.9 (a) Cumulative distribution function (CDF). (b) Probability density function (PDF).

From Eq. (8.29), we can see that

$$F_{\mathrm{x}}(x) = \int_{-\infty}^{x} p_{\mathrm{x}}(u)\,du \qquad (8.30)$$

Here we use the fact that $F_{\mathrm{x}}(-\infty) = 0$. We also have from Eq. (8.26)

$$
\begin{aligned}
P(x_1 < \mathrm{x} \le x_2) &= F_{\mathrm{x}}(x_2) - F_{\mathrm{x}}(x_1) \\
&= \int_{-\infty}^{x_2} p_{\mathrm{x}}(x)\,dx - \int_{-\infty}^{x_1} p_{\mathrm{x}}(x)\,dx \\
&= \int_{x_1}^{x_2} p_{\mathrm{x}}(x)\,dx
\end{aligned}
\qquad (8.31)
$$

Thus, the probability of observing $\mathrm{x}$ in any interval $(x_1, x_2)$ is given by the area under the
PDF $p_{\mathrm{x}}(x)$ over the interval $(x_1, x_2)$, as shown in Fig. 8.9b. Compare this with a continuously
loaded beam (Fig. 3.5b), where the weight over any interval was given by an integral of the
loading density over the interval.
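Equations (8.28) and (8.31) can be checked numerically for any PDF whose CDF is known. The sketch below assumes, purely for illustration, a unit-rate exponential density $p(x) = e^{-x}$ for $x \ge 0$, which is not a density discussed in this section; the helper names and the trapezoidal integrator are likewise hypothetical.

```python
import math


def pdf(x):
    """Assumed example PDF: p(x) = exp(-x) for x >= 0, zero otherwise."""
    return math.exp(-x) if x >= 0 else 0.0


def cdf(x):
    """Matching CDF: F(x) = 1 - exp(-x) for x >= 0, zero otherwise."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def integrate(f, a, b, steps=10_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


x1, x2 = 0.5, 2.0
by_cdf = cdf(x2) - cdf(x1)        # F(x2) - F(x1), first line of Eq. (8.31)
by_area = integrate(pdf, x1, x2)  # area under the PDF, last line of Eq. (8.31)

dx = 1e-4
small_interval = cdf(1.0 + dx) - cdf(1.0)  # P(x < X <= x + dx) at x = 1
density_estimate = pdf(1.0) * dx           # p(x) dx, as in Eq. (8.28)
```

The two routes to the interval probability agree to within the integration error, and the small-interval probability approaches $p(x)\Delta x$ as the interval shrinks.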

Because $F_{\mathrm{x}}(\infty) = 1$, we have

$$\int_{-\infty}^{\infty} p_{\mathrm{x}}(x)\,dx = 1 \qquad (8.32)$$

This also follows from the fact that the integral in Eq. (8.32) represents the probability of
observing $\mathrm{x}$ in the interval $(-\infty, \infty)$. Every PDF must satisfy the condition in Eq. (8.32). It
is also evident that the PDF must not be negative, that is,

$$p_{\mathrm{x}}(x) \ge 0$$

Although it is true that the probability of an impossible event is 0 and that of a certain event
is 1, the converse is not true. An event whose probability is 0 is not necessarily an impossible
event, and an event with a probability of 1 is not necessarily a certain event. This may be
illustrated by the following example. The temperature $\mathrm{T}$ of a certain city on a summer day is
an RV taking on any value in the range of 5 to 50°C. Because the PDF $p_{\mathrm{T}}(T)$ is continuous, the
probability that $\mathrm{T} = 34.56$, for example, is zero. But this is not an impossible event. Similarly,
the probability that $\mathrm{T}$ takes on any value but 34.56 is 1, although this is not a certain event.
In fact, a continuous RV $\mathrm{x}$ takes every value in a certain range. Yet $P_{\mathrm{x}}(x)$, the probability that
$\mathrm{x} = x$, is zero for every $x$ in that range.

We can also determine the PDF $p_{\mathrm{x}}(x)$ for a discrete random variable. Because the CDF
$F_{\mathrm{x}}(x)$ for the discrete case is always a sequence of step functions (Fig. 8.8), the PDF (the
derivative of the CDF) will consist of a train of positive impulses. If an RV $\mathrm{x}$ takes values
$x_1, x_2, \ldots, x_n$ with probabilities $a_1, a_2, \ldots, a_n$, respectively, then

$$F_{\mathrm{x}}(x) = a_1 u(x - x_1) + a_2 u(x - x_2) + \cdots + a_n u(x - x_n) \qquad (8.33a)$$

This can be easily verified from Example 8.15 (Fig. 8.8). Hence,

$$
\begin{aligned}
p_{\mathrm{x}}(x) &= a_1 \delta(x - x_1) + a_2 \delta(x - x_2) + \cdots + a_n \delta(x - x_n) \\
&= \sum_{r=1}^{n} a_r \delta(x - x_r)
\end{aligned}
\qquad (8.33b)
$$

It is, of course, possible to have a mixed case, where a PDF may have a continuous part and
an impulsive part (see Prob. 8.2-4).
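The staircase CDF of Eq. (8.33a) is straightforward to evaluate as a sum of weighted unit steps. The sketch below is illustrative only: `step_cdf` is a hypothetical helper, and it assumes the convention $u(0) = 1$ so that the CDF includes the probability located exactly at $x$.

```python
def step_cdf(values, probs):
    """Return a function F(x) = sum_r a_r u(x - x_r), taking u(0) = 1."""
    def F(x):
        # Each impulse of weight a_r at x_r contributes a step of height a_r.
        return sum(a for v, a in zip(values, probs) if x >= v)
    return F


# Number of heads in four coin tosses (Example 8.15).
F = step_cdf([0, 1, 2, 3, 4], [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16])
```

Evaluating `F` between two neighboring values returns the same number anywhere in the gap, which is the flat part of each stair in Fig. 8.8b.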

The Gaussian Random Variable 

Consider a PDF (Fig. 8.10)

$$p_{\mathrm{x}}(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \qquad (8.34)$$

This is a case of the well-known standard Gaussian, or normal, probability density. It has
zero mean and unit variance. This function was named after the famous mathematician Carl
Friedrich Gauss.

The CDF $F_{\mathrm{x}}(x)$ in this case is

$$F_{\mathrm{x}}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du$$

Figure 8.10 (a) Gaussian PDF. (b) Function $Q(y)$. (c) CDF of the Gaussian PDF.

This integral cannot be evaluated in a closed form and must be computed numerically. It is
convenient to use the function $Q(\cdot)$, defined as [2]

$$Q(y) = \frac{1}{\sqrt{2\pi}} \int_{y}^{\infty} e^{-x^2/2}\,dx \qquad (8.35)$$

The area under $p_{\mathrm{x}}(x)$ from $y$ to $\infty$ (shaded in Fig. 8.10a) is* $Q(y)$. From the symmetry of $p_{\mathrm{x}}(x)$
about the origin, and the fact that the total area under $p_{\mathrm{x}}(x)$ is 1, it follows that

$$Q(-y) = 1 - Q(y) \qquad (8.36)$$

Observe that for the PDF in Fig. 8.10a, the CDF is given by (Fig. 8.10c)

$$F_{\mathrm{x}}(x) = 1 - Q(x) \qquad (8.37)$$

The function $Q(x)$ is tabulated in Table 8.2 (see also later: Fig. 8.12d). This function is widely
tabulated and can be found in most of the standard mathematical tables [2, 3]. It can be shown
that [4]

$$Q(x) \simeq \frac{1}{x\sqrt{2\pi}}\,e^{-x^2/2} \quad \text{for } x \gg 1 \qquad (8.38a)$$

For example, when $x = 2$, the error in this approximation is 18.7%. But for $x = 4$ it is about 5.6%,
and for $x = 6$ it is about 2.6%.

A much better approximation to $Q(x)$ is

$$Q(x) \simeq \frac{1}{x\sqrt{2\pi}} \left(1 - \frac{0.7}{x^2}\right) e^{-x^2/2} \qquad x > 2 \qquad (8.38b)$$

The error in this approximation is just within 1% for $x > 2.15$. For larger values of $x$ the error
approaches 0.
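The function $Q(x)$ and its two approximations can be compared numerically. The sketch below computes $Q(x)$ from the complementary error function in Python's standard `math` module, using the identity $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$; the function names are hypothetical, and the relative error is measured against this `erfc`-based value.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_approx_a(x):
    """Eq. (8.38a): exp(-x^2/2) / (x sqrt(2 pi)), intended for x >> 1."""
    return math.exp(-0.5 * x * x) / (x * math.sqrt(2.0 * math.pi))


def q_approx_b(x):
    """Eq. (8.38b): Eq. (8.38a) scaled by (1 - 0.7/x^2), intended for x > 2."""
    return (1.0 - 0.7 / (x * x)) * q_approx_a(x)


def relative_error(approx, exact):
    """Magnitude of the error relative to the exact value."""
    return abs(approx - exact) / exact


# Relative errors (approximation a, approximation b) at a few sample points.
errors = {
    x: (relative_error(q_approx_a(x), q_function(x)),
        relative_error(q_approx_b(x), q_function(x)))
    for x in (2.0, 3.0, 4.0, 6.0)
}
```

Inspecting `errors` shows both approximations improving as $x$ grows, with the second one markedly closer at moderate $x$. The same `q_function` also gives the standard Gaussian CDF as `1 - q_function(x)`, following Eq. (8.37).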

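A more general Gaussian density function has two parameters $(m, \sigma)$ and is (Fig. 8.11)

$$p_{\mathrm{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-m)^2/2\sigma^2} \qquad (8.39)$$

For this case,

$$F_{\mathrm{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-m)^2/2\sigma^2}\,du$$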


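* The function $Q(x)$ is closely related to functions $\mathrm{erf}(x)$ and $\mathrm{erfc}(x)$:

$$\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} e^{-y^2}\,dy = 2Q(x\sqrt{2})$$

Therefore,

$$Q(x) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right) = \frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$$

TABLE 8.2 [3]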

Q(x)

x    0.00       0.01       0.02       0.03       0.04       0.05       0.06       0.07       0.08       0.09
0.0  .5000      .4960      .4920      .4880      .4840      .4801      .4761      .4721      .4681      .4641
0.1  .4602      .4562      .4522      .4483      .4443      .4404      .4364      .4325      .4286      .4247
0.2  .4207      .4168      .4129      .4090      .4052      .4013      .3974      .3936      .3897      .3859
0.3  .3821      .3783      .3745      .3707      .3669      .3632      .3594      .3557      .3520      .3483
0.4  .3446      .3409      .3372      .3336      .3300      .3264      .3228      .3192      .3156      .3121
0.5  .3085      .3050      .3015      .2981      .2946      .2912      .2877      .2843      .2810      .2776
0.6  .2743      .2709      .2676      .2643      .2611      .2578      .2546      .2514      .2483      .2451
0.7  .2420      .2389      .2358      .2327      .2296      .2266      .2236      .2206      .2177      .2148
0.8  .2119      .2090      .2061      .2033      .2005      .1977      .1949      .1922      .1894      .1867
0.9  .1841      .1814      .1788      .1762      .1736      .1711      .1685      .1660      .1635      .1611
1.0  .1587      .1562      .1539      .1515      .1492      .1469      .1446      .1423      .1401      .1379
1.1  .1357      .1335      .1314      .1292      .1271      .1251      .1230      .1210      .1190      .1170
1.2  .1151      .1131      .1112      .1093      .1075      .1056      .1038      .1020      .1003      .9853E-01
1.3  .9680E-01  .9510E-01  .9342E-01  .9176E-01  .9012E-01  .8851E-01  .8691E-01  .8534E-01  .8379E-01  .8226E-01
1.4  .8076E-01  .7927E-01  .7780E-01  .7636E-01  .7493E-01  .7353E-01  .7215E-01  .7078E-01  .6944E-01  .6811E-01
1.5  .6681E-01  .6552E-01  .6426E-01  .6301E-01  .6178E-01  .6057E-01  .5938E-01  .5821E-01  .5705E-01  .5592E-01
1.6  .5480E-01  .5370E-01  .5262E-01  .5155E-01  .5050E-01  .4947E-01  .4846E-01  .4746E-01  .4648E-01  .4551E-01
1.7  .4457E-01  .4363E-01  .4272E-01  .4182E-01  .4093E-01  .4006E-01  .3920E-01  .3836E-01  .3754E-01  .3673E-01
1.8  .3593E-01  .3515E-01  .3438E-01  .3362E-01  .3288E-01  .3216E-01  .3144E-01  .3074E-01  .3005E-01  .2938E-01
1.9  .2872E-01  .2807E-01  .2743E-01  .2680E-01  .2619E-01  .2559E-01  .2500E-01  .2442E-01  .2385E-01  .2330E-01
2.0  .2275E-01  .2222E-01  .2169E-01  .2118E-01  .2068E-01  .2018E-01  .1970E-01  .1923E-01  .1876E-01  .1831E-01
2.1  .1786E-01  .1743E-01  .1700E-01  .1659E-01  .1618E-01  .1578E-01  .1539E-01  .1500E-01  .1463E-01  .1426E-01
2.2  .1390E-01  .1355E-01  .1321E-01  .1287E-01  .1255E-01  .1222E-01  .1191E-01  .1160E-01  .1130E-01  .1101E-01
2.3  .1072E-01  .1044E-01  .1017E-01  .9903E-02  .9642E-02  .9387E-02  .9137E-02  .8894E-02  .8656E-02  .8424E-02
2.4  .8198E-02  .7976E-02  .7760E-02  .7549E-02  .7344E-02  .7143E-02  .6947E-02  .6756E-02  .6569E-02  .6387E-02
2.5  .6210E-02  .6037E-02  .5868E-02  .5703E-02  .5543E-02  .5386E-02  .5234E-02  .5085E-02  .4940E-02  .4799E-02
2.6  .4661E-02  .4527E-02  .4396E-02  .4269E-02  .4145E-02  .4025E-02  .3907E-02  .3793E-02  .3681E-02  .3573E-02
2.7  .3467E-02  .3364E-02  .3264E-02  .3167E-02  .3072E-02  .2980E-02  .2890E-02  .2803E-02  .2718E-02  .2635E-02
2.8  .2555E-02  .2477E-02  .2401E-02  .2327E-02  .2256E-02  .2186E-02  .2118E-02  .2052E-02  .1988E-02  .1926E-02
2.9  .1866E-02  .1807E-02  .1750E-02  .1695E-02  .1641E-02  .1589E-02  .1538E-02  .1489E-02  .1441E-02  .1395E-02
3.0  .1350E-02  .1306E-02  .1264E-02  .1223E-02  .1183E-02  .1144E-02  .1107E-02  .1070E-02  .1035E-02  .1001E-02
3.1  .9676E-03  .9354E-03  .9043E-03  .8740E-03  .8447E-03  .8164E-03  .7888E-03  .7622E-03  .7364E-03  .7114E-03
3.2  .6871E-03  .6637E-03  .6410E-03  .6190E-03  .5976E-03  .5770E-03  .5571E-03  .5377E-03  .5190E-03  .5009E-03
3.3  .4834E-03  .4665E-03  .4501E-03  .4342E-03  .4189E-03  .4041E-03  .3897E-03  .3758E-03  .3624E-03  .3495E-03
3.4  .3369E-03  .3248E-03  .3131E-03  .3018E-03  .2909E-03  .2802E-03  .2701E-03  .2602E-03  .2507E-03  .2415E-03
3.5  .2326E-03  .2241E-03  .2158E-03  .2078E-03  .2001E-03  .1926E-03  .1854E-03  .1785E-03  .1718E-03  .1653E-03
3.6  .1591E-03  .1531E-03  .1473E-03  .1417E-03  .1363E-03  .1311E-03  .1261E-03  .1213E-03  .1166E-03  .1121E-03
3.7  .1078E-03  .1036E-03  .9961E-04  .9574E-04  .9201E-04  .8842E-04  .8496E-04  .8162E-04  .7841E-04  .7532E-04
3.8  .7235E-04  .6948E-04  .6673E-04  .6407E-04  .6152E-04  .5906E-04  .5669E-04  .5442E-04  .5223E-04  .5012E-04
3.9  .4810E-04  .4615E-04  .4427E-04  .4247E-04  .4074E-04  .3908E-04  .3747E-04  .3594E-04  .3446E-04  .3304E-04
4.0  .3167E-04  .3036E-04  .2910E-04  .2789E-04  .2673E-04  .2561E-04  .2454E-04  .2351E-04  .2252E-04  .2157E-04
4.1  .2066E-04  .1978E-04  .1894E-04  .1814E-04  .1737E-04  .1662E-04  .1591E-04  .1523E-04  .1458E-04  .1395E-04
4.2  .1335E-04  .1277E-04  .1222E-04  .1168E-04  .1118E-04  .1069E-04  .1022E-04  .9774E-05  .9345E-05  .8934E-05
4.3  .8540E-05  .8163E-05  .7801E-05  .7455E-05  .7124E-05  .6807E-05  .6503E-05  .6212E-05  .5934E-05  .5668E-05
4.4  .5413E-05  .5169E-05  .4935E-05  .4712E-05  .4498E-05  .4294E-05  .4098E-05  .3911E-05  .3732E-05  .3561E-05
4.5  .3398E-05  .3241E-05  .3092E-05  .2949E-05  .2813E-05  .2682E-05  .2558E-05  .2439E-05  .2325E-05  .2216E-05
4.6  .2112E-05  .2013E-05  .1919E-05  .1828E-05  .1742E-05  .1660E-05  .1581E-05  .1506E-05  .1434E-05  .1366E-05
4.7  .1301E-05  .1239E-05  .1179E-05  .1123E-05  .1069E-05  .1017E-05  .9680E-06  .9211E-06  .8765E-06  .8339E-06
4.8  .7933E-06  .7547E-06  .7178E-06  .6827E-06  .6492E-06  .6173E-06  .5869E-06  .5580E-06  .5304E-06  .5042E-06
4.9  .4792E-06  .4554E-06  .4327E-06  .4111E-06  .3906E-06  .3711E-06  .3525E-06  .3348E-06  .3179E-06  .3019E-06
5.0  .2867E-06  .2722E-06  .2584E-06  .2452E-06  .2328E-06  .2209E-06  .2096E-06  .1989E-06  .1887E-06  .1790E-06
5.1  .1698E-06  .1611E-06  .1528E-06  .1449E-06  .1374E-06  .1302E-06  .1235E-06  .1170E-06  .1109E-06  .1051E-06
5.2  .9964E-07  .9442E-07  .8946E-07  .8476E-07  .8029E-07  .7605E-07  .7203E-07  .6821E-07  .6459E-07  .6116E-07
5.3  .5790E-07  .5481E-07  .5188E-07  .4911E-07  .4647E-07  .4398E-07  .4161E-07  .3937E-07  .3724E-07  .3523E-07
5.4  .3332E-07  .3151E-07  .2980E-07  .2818E-07  .2664E-07  .2518E-07  .2381E-07  .2250E-07  .2127E-07  .2010E-07
5.5  .1899E-07  .1794E-07  .1695E-07  .1601E-07  .1512E-07  .1428E-07  .1349E-07  .1274E-07  .1203E-07  .1135E-07
5.6  .1072E-07  .1012E-07  .9548E-08  .9010E-08  .8503E-08  .8022E-08  .7569E-08  .7140E-08  .6735E-08  .6352E-08
5.7  .5990E-08  .5649E-08  .5326E-08  .5022E-08  .4734E-08  .4462E-08  .4206E-08  .3964E-08  .3735E-08  .3519E-08
5.8  .3316E-08  .3124E-08  .2942E-08  .2771E-08  .2610E-08  .2458E-08  .2314E-08  .2179E-08  .2051E-08  .1931E-08
5.9  .1818E-08  .1711E-08  .1610E-08  .1515E-08  .1425E-08  .1341E-08  .1261E-08  .1186E-08  .1116E-08  .1049E-08
6.0  .9866E-09  .9276E-09  .8721E-09  .8198E-09  .7706E-09  .7242E-09  .6806E-09  .6396E-09  .6009E-09  .5646E-09
6.1  .5303E-09  .4982E-09  .4679E-09  .4394E-09  .4126E-09  .3874E-09  .3637E-09  .3414E-09  .3205E-09  .3008E-09
6.2  .2823E-09  .2649E-09  .2486E-09  .2332E-09  .2188E-09  .2052E-09  .1925E-09  .1805E-09  .1692E-09  .1587E-09
6.3  .1488E-09  .1395E-09  .1308E-09  .1226E-09  .1149E-09  .1077E-09  .1009E-09  .9451E-10  .8854E-10  .8294E-10
6.4  .7769E-10  .7276E-10  .6814E-10  .6380E-10  .5974E-10  .5593E-10  .5235E-10  .4900E-10  .4586E-10  .4292E-10
6.5  .4016E-10  .3758E-10  .3515E-10  .3288E-10  .3077E-10  .2877E-10  .2690E-10  .2516E-10  .2352E-10  .2199E-10
6.6  .2056E-10  .1922E-10  .1796E-10  .1678E-10  .1568E-10  .1465E-10  .1369E-10  .1279E-10  .1195E-10  .1116E-10
6.7  .1042E-10  .9731E-11  .9086E-11  .8483E-11  .7919E-11  .7392E-11  .6900E-11  .6439E-11  .6009E-11  .5607E-11
6.8  .5231E-11  .4880E-11  .4552E-11  .4246E-11  .3960E-11  .3692E-11  .3443E-11  .3210E-11  .2993E-11  .2790E-11
6.9  .2600E-11  .2423E-11  .2258E-11  .2104E-11  .1960E-11  .1826E-11  .1701E-11  .1585E-11  .1476E-11  .1374E-11
7.0  .1280E-11  .1192E-11  .1109E-11  .1033E-11  .9612E-12  .8946E-12  .8325E-12  .7747E-12  .7208E-12  .6706E-12
7.1  .6238E-12  .5802E-12  .5396E-12  .5018E-12  .4667E-12  .4339E-12  .4034E-12  .3750E-12  .3486E-12  .3240E-12
7.2  .3011E-12  .2798E-12  .2599E-12  .2415E-12  .2243E-12  .2084E-12  .1935E-12  .1797E-12  .1669E-12  .1550E-12
7.3  .1439E-12  .1336E-12  .1240E-12  .1151E-12  .1068E-12  .9910E-13  .9196E-13  .8531E-13  .7914E-13  .7341E-13
7.4  .6809E-13  .6315E-13  .5856E-13  .5430E-13  .5034E-13  .4667E-13  .4326E-13  .4010E-13  .3716E-13  .3444E-13
7.5  .3191E-13  .2956E-13  .2739E-13  .2537E-13  .2350E-13  .2176E-13  .2015E-13  .1866E-13  .1728E-13  .1600E-13
7.6  .1481E-13  .1370E-13  .1268E-13  .1174E-13  .1086E-13  .1005E-13  .9297E-14  .8600E-14  .7954E-14  .7357E-14
7.7  .6803E-14  .6291E-14  .5816E-14  .5377E-14  .4971E-14  .4595E-14  .4246E-14  .3924E-14  .3626E-14  .3350E-14
7.8  .3095E-14  .2859E-14  .2641E-14  .2439E-14  .2253E-14  .2080E-14  .1921E-14  .1773E-14  .1637E-14  .1511E-14
7.9  .1395E-14  .1287E-14  .1188E-14  .1096E-14  .1011E-14  .9326E-15  .8602E-15  .7934E-15  .7317E-15  .6747E-15
8.0  .6221E-15  .5735E-15  .5287E-15  .4874E-15  .4492E-15  .4140E-15  .3815E-15  .3515E-15  .3238E-15  .2983E-15
8.1  .2748E-15  .2531E-15  .2331E-15  .2146E-15  .1976E-15  .1820E-15  .1675E-15  .1542E-15  .1419E-15  .1306E-15
8.2  .1202E-15  .1106E-15  .1018E-15  .9361E-16  .8611E-16  .7920E-16  .7284E-16

-669SE 16 

.6159E-16 

5662E-16 

8.300 

.5206E-16 

.4785E-16 

.4398E-16 

,4042E-16 

.3715E-16 

.3413E-I6 

3136E 16 

.288IE-16 

.2646E-16 

,243IE-16 

8.400 

.2232E-16 

.2050E-16 

1882E-16 

1728E-16 

1587E-16 

.1457E-16 

. 1337E-16 

, 1227E-16 

1126E-16 

, 1033E-16 

8.500 

.9480E-17 

8697E-17 

.7978E-17 

7317E-17 

.671 IE-17 

.6154E-17 

.5643E-17 

5174E-17 

.4744E-17 

.4348E-17 

8,600 

3986E-17 

.3653E-17 

.3348E-17 

.3068E-17 

.281 IE-17 

,2575E-17 

.2359E-17 

.2161E-17 

.1979E-17 

1812E-17 

8,700 

. 1659E-17 

1519E-17 

.1391E-17 

,1273E-17 

1166E-17 

I067E-L7 

.9763E-18 

.R933E-18 

.8174E-18 

.7478E-1S 

8.800 

,684IE-18 

.6257E-18 

.5723E-18 

5234E-18 

.4786E-18 

.4376E-18 

.400 IE-18 

.3657E-18 

3343E-1S 

.3055E-18 

8.900 

,2792E-18 

.2552E-1S 

2331E-18 

,2l30E-J8 

.1946E-18 

.1777E-18 

.1623E-18 

.14S3E-18 

1354E-18 

1236E48 

9.000 

, 1129E-18 

. 1030E-18 

,9404E-19 

.8584E-19 

7834E-19 

.7148E-19 

.6523E-19 

.5951E-19 

,5429E-19 

,4952E-19 

9.100 

4517E-19 

4119E-19 

.3756E-19 

3425E-19 

3123E-19 

.2847E-19 

2595E-19 

.2365E-19 

.2155E-19 

.1964E-19 

9.200 

.1790E-19 

. 1631E-19 

1486E-19 

.1353E-I9 

,1232E-19 

1122E-19 

.1022E-19 

.9307E-20 

.8474E-20 

,7714E-20 

9.300 

.7022E-20 

,6392E-20 

.5817E-20 

.5294E-20 

,48l7E-20 

.4382E-20 

.3987E-20 

,3627E-20 

,3299E-20 

3000E-20 

9.400 

.2728E-20 

.248IE-20 

.2255E-20 

,2050E-20 

. 1864E-20 

. 1694E-20 

,1540E-20 

. 1399E-20 

.1271E-20 

1155E-20 

9.500 

.1049E-20 

9533E-21 

.8659E-21 

.7864E-21 

-7U2E-21 

.6485E-21 

.5888E-21 

.5345E-21 

4852E-21 

,4404E-21 

9.600 

.3997E-21 

,3627E-21 

.3292E-21 

.2986E-21 

.2709E-21 

.245SE-21 

.2229E-21 

.2022E-21 

.1834E-2I 

.1663E-21 

9,700 

1507E-21 

.1367E-21 

. 1239E-21 

.I123E-2I 

.1018E-21 

.9223E-22 

.8358E-22 

.7573E-22 

.6861E-22 

6215E-22 

9,800 

.5629E-22 

5098E-22 

.4617E-22 

4181E-22 

.3786E-22 

.3427E-22 

,3102E-22 

2808E-22 

.2542E-22 

,2300E-22 

9.900 

2081E-22 

.1883E-22 

. 1704E-22 

.1541E-22 

, 1394E-22 

. 1261E-22 

1140E-22 

.1031E-22 

.9323E-23 

.8429E-23 

10,00 

.7620E-23 

.6888E-23 

,6225E-23 

.5626E-23 

.5084E-23 

.4593E-23 

.4150E-23 

.3749E-23 

.3386E-23 

-305SE-23 

Notes:

(1) E-01 should be read as $\times 10^{-1}$; E-02 should be read as $\times 10^{-2}$, and so on.

(2) This table lists $Q(x)$ for $x$ in the range of 0 to 10 in increments of 0.01. To find $Q(5.36)$, for example,
look up the row starting with $x = 5.3$. The entry in this row under the 0.06 column is the desired value
$0.4161 \times 10^{-7}$.

Letting $(x - m)/\sigma = z$,

$$F_{\mathrm{x}}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x-m)/\sigma} e^{-z^2/2}\,dz = 1 - Q\left(\frac{x-m}{\sigma}\right) \tag{8.40a}$$

Therefore,

$$P(\mathrm{x} \le x) = 1 - Q\left(\frac{x-m}{\sigma}\right) \tag{8.40b}$$

and

$$P(\mathrm{x} > x) = Q\left(\frac{x-m}{\sigma}\right) \tag{8.40c}$$
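Equations (8.40a) to (8.40c) reduce any Gaussian probability to an evaluation of $Q(\cdot)$. The following is a minimal sketch of how the tabulated values could be reproduced numerically. It assumes the standard identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$, which is not stated in the text above, and the function names are illustrative only.

```python
import math


def q_function(x: float) -> float:
    """Tail probability Q(x) = P(z > x) of a zero-mean, unit-variance Gaussian RV."""
    # Assumed identity (not from the text): Q(x) = 0.5 * erfc(x / sqrt(2)).
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gaussian_cdf(x: float, m: float = 0.0, sigma: float = 1.0) -> float:
    """F(x) = 1 - Q((x - m) / sigma), following Eq. (8.40a)."""
    return 1.0 - q_function((x - m) / sigma)


def gaussian_tail(x: float, m: float = 0.0, sigma: float = 1.0) -> float:
    """P(X > x) = Q((x - m) / sigma), following Eq. (8.40c)."""
    return q_function((x - m) / sigma)


if __name__ == "__main__":
    # Compare with the table entry in row 5.3 under column 0.06.
    print(f"Q(5.36) = {q_function(5.36):.4e}")
```

Computing $Q$ through `erfc` rather than as `1 - cdf` avoids the loss of precision that would otherwise occur for large arguments, where the tail probability is many orders of magnitude smaller than 1.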


The Gaussian PDF is perhaps the most important PDF in the field of communications. The
majority of the noise processes observed in practice are Gaussian. The amplitude $\mathrm{n}$ of a Gaussian
noise signal is an RV with a Gaussian PDF. This means the probability of observing $\mathrm{n}$ in an
interval $(n, n + \Delta n)$ is $p_{\mathrm{n}}(n)\,\Delta n$, where $p_{\mathrm{n}}(n)$ is of the form in Eq. (8.39) [with $m = 0$].


Example 8.16 Threshold Detection

Over a certain binary channel, messages $m = 0$ and 1 are transmitted with equal probability by
using a negative and a positive pulse, respectively. The received pulse corresponding to 1 is
$p(t)$, shown in Fig. 8.12a, and the received pulse corresponding to 0 is $-p(t)$. Let the peak
amplitude of $p(t)$ be $A_p$ at $t = T_p$. Because of the channel noise $n(t)$, the received pulses will
be (Fig. 8.12c)

$$\pm p(t) + n(t)$$

To detect the pulses at the receiver, each pulse is sampled at its peak amplitude. In the absence
of noise, the sampler output is either $A_p$ (for $m = 1$) or $-A_p$ (for $m = 0$). Because of the channel
noise, the sampler output is $\pm A_p + \mathrm{n}$, where $\mathrm{n}$, the noise amplitude at the sampling instant
(Fig. 8.12b), is an RV. For Gaussian noise, the PDF of $\mathrm{n}$ is (Fig. 8.12b)


$$p_{\mathrm{n}}(n) = \frac{1}{\sigma_n\sqrt{2\pi}}\, e^{-n^2/2\sigma_n^2} \tag{8.41}$$

Figure 8.12 Error probability in threshold detection: (a) transmitted pulse; (b) noise PDF;
(c) received pulses with noise; (d) detection error probability.



Because of the symmetry of the situation, the optimum detection threshold is zero; that
is, the received pulse is detected as a 1 or a 0, depending on whether the sample value is
positive or negative.

Because noise amplitudes range from $-\infty$ to $\infty$, the sample value $-A_p + \mathrm{n}$ can
occasionally be positive, causing the received 0 to be read as 1 (see Fig. 8.12b). Similarly,
$A_p + \mathrm{n}$ can occasionally be negative, causing the received 1 to be read as 0. If 0 is
transmitted, it will be detected as 1 if $-A_p + \mathrm{n} > 0$, that is, if $\mathrm{n} > A_p$.

If $P(\epsilon|0)$ is the error probability given that 0 is transmitted, then

$$P(\epsilon|0) = P(\mathrm{n} > A_p)$$

Because $P(\mathrm{n} > A_p)$ is the shaded area in Fig. 8.12b to the right of $A_p$, from Eq. (8.40c)
[with $m = 0$] it follows that

$$P(\epsilon|0) = Q\left(\frac{A_p}{\sigma_n}\right) \tag{8.42a}$$

Similarly,

$$P(\epsilon|1) = P(\mathrm{n} < -A_p) = Q\left(\frac{A_p}{\sigma_n}\right) = P(\epsilon|0) \tag{8.42b}$$

and

$$P_e = \sum_i P(\epsilon, m_i) = \sum_i P(m_i)\,P(\epsilon|m_i) = Q\left(\frac{A_p}{\sigma_n}\right) \tag{8.42c}$$

The error probability $P_e$ can be found from Fig. 8.12d.
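The result in Eq. (8.42c) can be checked against a direct simulation of the sampler and the zero-threshold detector. The sketch below is illustrative only: the values of $A_p$ and $\sigma_n$, the trial count, the seed, and all function names are assumptions chosen for the demonstration, and $Q$ is again evaluated through the complementary error function.

```python
import math
import random


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x), via the assumed identity 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def threshold_error_probability(a_p: float, sigma_n: float) -> float:
    """P_e = Q(A_p / sigma_n), as in Eq. (8.42c)."""
    return q_function(a_p / sigma_n)


def simulate_threshold_detection(a_p: float, sigma_n: float,
                                 trials: int = 200_000, seed: int = 1) -> float:
    """Estimate P_e by sending equiprobable bits as +/-A_p plus Gaussian noise."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        bit = rng.randint(0, 1)
        # Sampler output: +A_p for a 1, -A_p for a 0, plus the noise amplitude n.
        sample = (a_p if bit == 1 else -a_p) + rng.gauss(0.0, sigma_n)
        detected = 1 if sample > 0 else 0  # zero threshold
        if detected != bit:
            errors += 1
    return errors / trials


if __name__ == "__main__":
    a_p, sigma_n = 1.0, 1.0  # hypothetical values
    print("formula   :", threshold_error_probability(a_p, sigma_n))
    print("simulation:", simulate_threshold_detection(a_p, sigma_n))
```

With a few hundred thousand trials the simulated error rate should agree with the formula to within about one percentage point; very small error probabilities would need far more trials to observe reliably.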


Joint Distribution 

For two RVs $\mathrm{x}$ and $\mathrm{y}$, we define a CDF $F_{\mathrm{xy}}(x, y)$ as follows:

$$F_{\mathrm{xy}}(x, y) = P(\mathrm{x} \le x \text{ and } \mathrm{y} \le y) \tag{8.43}$$

and the joint PDF $p_{\mathrm{xy}}(x, y)$ as

$$p_{\mathrm{xy}}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{\mathrm{xy}}(x, y) \tag{8.44}$$

Arguing along lines similar to those used for a single variable, we can show that as $\Delta x \to 0$
and $\Delta y \to 0$

$$p_{\mathrm{xy}}(x, y)\,\Delta x\,\Delta y = P(x < \mathrm{x} \le x + \Delta x,\; y < \mathrm{y} \le y + \Delta y) \tag{8.45}$$

Hence, the probability of observing the variables $\mathrm{x}$ in the interval $(x, x + \Delta x)$ and $\mathrm{y}$ in the
interval $(y, y + \Delta y)$ jointly is given by the volume under the joint PDF $p_{\mathrm{xy}}(x, y)$ over the
region bounded by $(x, x + \Delta x)$ and $(y, y + \Delta y)$, as shown in Fig. 8.13a.

From Eq. (8.45), it follows that

$$P(x_1 < \mathrm{x} \le x_2,\; y_1 < \mathrm{y} \le y_2) = \int_{x_1}^{x_2}\!\int_{y_1}^{y_2} p_{\mathrm{xy}}(x, y)\,dx\,dy \tag{8.46}$$

Thus, the probability of jointly observing $\mathrm{x}$ in the interval $(x_1, x_2)$ and $\mathrm{y}$ in the interval $(y_1, y_2)$
is the volume under the PDF over the region bounded by $(x_1, x_2)$ and $(y_1, y_2)$.

Figure 8.13 (a) Joint PDF. (b) Conditional PDF.




The event of observing $\mathrm{x}$ in the interval $(-\infty, \infty)$ and observing $\mathrm{y}$ in the interval $(-\infty, \infty)$
is a certainty. Hence,

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dx\,dy = 1 \tag{8.47}$$

Thus, the total volume under the joint PDF must be unity.

When we are dealing with two RVs $\mathrm{x}$ and $\mathrm{y}$, the individual probability densities $p_{\mathrm{x}}(x)$
and $p_{\mathrm{y}}(y)$ can be obtained from the joint density $p_{\mathrm{xy}}(x, y)$. These individual densities are also
called marginal densities. To obtain these densities, we note that $p_{\mathrm{x}}(x)\,\Delta x$ is the probability
of observing $\mathrm{x}$ in the interval $(x, x + \Delta x)$. The value of $\mathrm{y}$ may lie anywhere in the interval
$(-\infty, \infty)$. Hence,


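$$\lim_{\Delta x \to 0} p_{\mathrm{x}}(x)\,\Delta x = \lim_{\Delta x \to 0} P(x < \mathrm{x} \le x + \Delta x,\; -\infty < \mathrm{y} < \infty)$$

$$= \lim_{\Delta x \to 0} \int_{x}^{x+\Delta x}\!\int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dx\,dy$$

$$= \lim_{\Delta x \to 0} \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dy \int_{x}^{x+\Delta x} dx$$

$$= \lim_{\Delta x \to 0} \Delta x \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dy$$

The last two steps follow from the fact that $p_{\mathrm{xy}}(x, y)$ is constant over $(x, x + \Delta x)$ because
$\Delta x \to 0$. Therefore,

$$p_{\mathrm{x}}(x) = \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dy \tag{8.48a}$$

Similarly,

$$p_{\mathrm{y}}(y) = \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dx \tag{8.48b}$$

In terms of the CDF, we have

$$F_{\mathrm{y}}(y) = F_{\mathrm{xy}}(\infty, y)$$

$$F_{\mathrm{x}}(x) = F_{\mathrm{xy}}(x, \infty)$$

These results may be generalized for multiple RVs $\mathrm{x}_1, \mathrm{x}_2, \ldots, \mathrm{x}_n$.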
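Equation (8.48a) can be illustrated numerically by integrating a joint density over $y$ and comparing the result with the known marginal. The sketch below uses, purely as an assumed example, a zero-mean, unit-variance bivariate Gaussian density with correlation coefficient $\rho = 0.5$ (a density not introduced in the text above), whose marginal in $x$ is the standard Gaussian PDF. The integration limits and step count are also illustrative choices.

```python
import math

RHO = 0.5  # assumed correlation coefficient of the illustrative joint density


def joint_pdf(x: float, y: float) -> float:
    """Assumed example: zero-mean, unit-variance bivariate Gaussian with correlation RHO."""
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - RHO**2))
    exponent = -(x * x - 2.0 * RHO * x * y + y * y) / (2.0 * (1.0 - RHO**2))
    return norm * math.exp(exponent)


def standard_normal_pdf(x: float) -> float:
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def marginal_x(x: float, y_min: float = -20.0, y_max: float = 20.0,
               steps: int = 4000) -> float:
    """Approximate p_x(x) = integral of p_xy(x, y) dy, Eq. (8.48a), by the trapezoidal rule."""
    dy = (y_max - y_min) / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0  # end points get half weight
        total += weight * joint_pdf(x, y_min + k * dy)
    return total * dy


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5):
        print(x, marginal_x(x), standard_normal_pdf(x))
```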

Conditional Densities 

The concept of conditional probabilities can be extended to the case of continuous RVs. We
define the conditional PDF $p_{\mathrm{x|y}}(x|y_j)$ as the PDF of $\mathrm{x}$ given that $\mathrm{y}$ has the value $y_j$. This is
equivalent to saying that $p_{\mathrm{x|y}}(x|y_j)\,\Delta x$ is the probability of observing $\mathrm{x}$ in the range $(x, x + \Delta x)$,
given that $\mathrm{y} = y_j$. The probability density $p_{\mathrm{x|y}}(x|y_j)$ is the intersection of the plane $y = y_j$ with
the joint PDF $p_{\mathrm{xy}}(x, y)$ (Fig. 8.13b). Because every PDF must have unit area, however, we
must normalize the area under the intersection curve $C$ to unity to get the desired PDF. Hence,
$C$ is $A\,p_{\mathrm{x|y}}(x|y)$, where $A$ is the area under $C$. An extension of the results derived for the discrete
case yields


(8.49a)

(8.49b) 


and 


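$$p_{\mathrm{x|y}}(x|y)\,p_{\mathrm{y}}(y) = p_{\mathrm{xy}}(x, y) \tag{8.50a}$$

$$p_{\mathrm{y|x}}(y|x)\,p_{\mathrm{x}}(x) = p_{\mathrm{xy}}(x, y) \tag{8.50b}$$

Combining these two relations gives

$$p_{\mathrm{x|y}}(x|y) = \frac{p_{\mathrm{y|x}}(y|x)\,p_{\mathrm{x}}(x)}{p_{\mathrm{y}}(y)} \tag{8.51a}$$

Equation (8.51a) is Bayes' rule for continuous RVs. When we have mixed variables (i.e.,
discrete and continuous), the mixed form of Bayes' rule is

$$P_{\mathrm{x|y}}(x|y)\,p_{\mathrm{y}}(y) = P_{\mathrm{x}}(x)\,p_{\mathrm{y|x}}(y|x) \tag{8.51b}$$

where $\mathrm{x}$ is a discrete RV and $\mathrm{y}$ is a continuous RV.

Note that $p_{\mathrm{x|y}}(x|y)$ is still, first and foremost, a probability density function. Thus,

$$\int_{-\infty}^{\infty} p_{\mathrm{x|y}}(x|y)\,dx = \frac{\int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dx}{p_{\mathrm{y}}(y)} = \frac{p_{\mathrm{y}}(y)}{p_{\mathrm{y}}(y)} = 1 \tag{8.52}$$

It may be worth noting that $p_{\mathrm{x|y}}(x|y)$ is conditioned on an event $\mathrm{y} = y$ that has probability zero.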



Independent Random Variables 

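The continuous RVs $\mathrm{x}$ and $\mathrm{y}$ are said to be independent if

$$p_{\mathrm{x|y}}(x|y) = p_{\mathrm{x}}(x) \tag{8.53a}$$

In this case from Eqs. (8.53a) and (8.51) it follows that

$$p_{\mathrm{y|x}}(y|x) = p_{\mathrm{y}}(y) \tag{8.53b}$$

This implies that for independent RVs $\mathrm{x}$ and $\mathrm{y}$,

$$p_{\mathrm{xy}}(x, y) = p_{\mathrm{x}}(x)\,p_{\mathrm{y}}(y) \tag{8.53c}$$

Based on Eq. (8.53c), the joint CDF is also separable:

$$F_{\mathrm{xy}}(x, y) = \int_{-\infty}^{x}\!\int_{-\infty}^{y} p_{\mathrm{xy}}(v, w)\,dw\,dv$$

$$= \int_{-\infty}^{x} p_{\mathrm{x}}(v)\,dv \cdot \int_{-\infty}^{y} p_{\mathrm{y}}(w)\,dw$$

$$= F_{\mathrm{x}}(x) \cdot F_{\mathrm{y}}(y) \tag{8.54}$$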


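Example 8.17 Rayleigh Density

The Rayleigh density is characterized by the PDF (Fig. 8.14b)

$$p_{\mathrm{r}}(r) = \begin{cases} \dfrac{r}{\sigma^2}\, e^{-r^2/2\sigma^2} & r \ge 0 \\ 0 & r < 0 \end{cases} \tag{8.55}$$

Figure 8.14 Derivation of the Rayleigh density.

A Rayleigh RV can be derived from two independent Gaussian RVs as follows. Let $\mathrm{x}$ and
$\mathrm{y}$ be independent Gaussian variables with identical PDFs:

$$p_{\mathrm{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

$$p_{\mathrm{y}}(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-y^2/2\sigma^2}$$

Then

$$p_{\mathrm{xy}}(x, y) = p_{\mathrm{x}}(x)\,p_{\mathrm{y}}(y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2} \tag{8.56}$$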


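The joint density appears somewhat like the bell-shaped surface shown in Fig. 8.13. The
points in the $(x, y)$ plane can also be described in polar coordinates as $(r, \theta)$, where
(Fig. 8.14a)

$$r = \sqrt{x^2 + y^2} \qquad \theta = \tan^{-1}\frac{y}{x}$$

In Fig. 8.14a, the shaded region represents $r < \mathrm{r} \le r + dr$ and $\theta < \Theta \le \theta + d\theta$ (where
$dr$ and $d\theta$ both $\to 0$). Hence, if $p_{\mathrm{r}\Theta}(r, \theta)$ is the joint PDF of $\mathrm{r}$ and $\Theta$, then by definition
[Eq. (8.45)], the probability of observing $\mathrm{r}$ and $\Theta$ in this region is $p_{\mathrm{r}\Theta}(r, \theta)\,dr\,d\theta$. But we
also know that this probability is $p_{\mathrm{xy}}(x, y)$ times the area $r\,dr\,d\theta$ of the shaded region.
Hence, [Eq. (8.56)]

$$\frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2}\, r\,dr\,d\theta = p_{\mathrm{r}\Theta}(r, \theta)\,dr\,d\theta$$

and

$$p_{\mathrm{r}\Theta}(r, \theta) = \frac{r}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2} = \frac{r}{2\pi\sigma^2}\, e^{-r^2/2\sigma^2} \tag{8.57}$$

and [Eq. (8.48a)]

$$p_{\mathrm{r}}(r) = \int_{-\infty}^{\infty} p_{\mathrm{r}\Theta}(r, \theta)\,d\theta$$

Because $\Theta$ exists only in the region $(0, 2\pi)$,

$$p_{\mathrm{r}}(r) = \int_{0}^{2\pi} \frac{r}{2\pi\sigma^2}\, e^{-r^2/2\sigma^2}\,d\theta = \frac{r}{\sigma^2}\, e^{-r^2/2\sigma^2}\,u(r) \tag{8.58a}$$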


Note that $\mathrm{r}$ is always greater than 0. In a similar way, we find

$$p_{\Theta}(\theta) = \begin{cases} \dfrac{1}{2\pi} & 0 < \theta < 2\pi \\ 0 & \text{otherwise} \end{cases} \tag{8.58b}$$

RVs $\mathrm{r}$ and $\Theta$ are independent because $p_{\mathrm{r}\Theta}(r, \theta) = p_{\mathrm{r}}(r)\,p_{\Theta}(\theta)$. The PDF $p_{\mathrm{r}}(r)$ is the
Rayleigh density function. We shall later show that the envelope of narrowband Gaussian
noise has a Rayleigh density. Both $p_{\mathrm{r}}(r)$ and $p_{\Theta}(\theta)$ are shown in Fig. 8.14b and c.
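The construction in this example can be tried out by drawing pairs of independent Gaussian samples and examining $r = \sqrt{x^2 + y^2}$. The sketch below is a rough check under stated assumptions: the value of $\sigma$, the sample count, and the seed are arbitrary, and the closed-form CDF $1 - e^{-a^2/2\sigma^2}$ used for comparison is obtained here by integrating Eq. (8.58a) from 0 to $a$ rather than taken from the text.

```python
import math
import random


def rayleigh_pdf(r: float, sigma: float) -> float:
    """Rayleigh density of Eq. (8.55)."""
    if r < 0:
        return 0.0
    return (r / sigma**2) * math.exp(-r * r / (2.0 * sigma**2))


def rayleigh_cdf(a: float, sigma: float) -> float:
    """P(r <= a), found by integrating the density above from 0 to a."""
    return 1.0 - math.exp(-a * a / (2.0 * sigma**2)) if a > 0 else 0.0


def sample_envelope(sigma: float, count: int, seed: int = 0) -> list:
    """Draw r = sqrt(x^2 + y^2) for independent zero-mean Gaussian x and y."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(count)]


if __name__ == "__main__":
    sigma = 2.0  # hypothetical value
    samples = sample_envelope(sigma, 100_000)
    observed = sum(1 for r in samples if r <= sigma) / len(samples)
    print("fraction with r <= sigma:", observed)
    print("Rayleigh CDF at sigma   :", rayleigh_cdf(sigma, sigma))
```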


8.3 STATISTICAL AVERAGES (MEANS) 


Averages are extremely important in the study of RVs. To find a proper definition for the
average of a random variable $\mathrm{x}$, consider the problem of determining the average height of the
entire population of a country. Let us assume that we have enough resources to gather data
about the height of every person. If the data is recorded within the accuracy of an inch, then
the height $\mathrm{x}$ of every person will be approximated to one of the $n$ numbers $x_1, x_2, \ldots, x_n$. If
there are $N_i$ persons of height $x_i$, then the average height $\bar{\mathrm{x}}$ is given by

$$\bar{\mathrm{x}} = \frac{N_1 x_1 + N_2 x_2 + \cdots + N_n x_n}{N}$$

where the total number of persons is $N = \sum_i N_i$. Hence,

$$\bar{\mathrm{x}} = \frac{N_1}{N} x_1 + \frac{N_2}{N} x_2 + \cdots + \frac{N_n}{N} x_n$$

In the limit as $N \to \infty$, the ratio $N_i/N$ approaches $P_{\mathrm{x}}(x_i)$ according to the relative frequency
definition of the probability. Hence,

$$\bar{\mathrm{x}} = \sum_{i=1}^{n} x_i P_{\mathrm{x}}(x_i)$$

The mean value is also called the average value, or expected value, of the RV $\mathrm{x}$ and is denoted
by $E[\mathrm{x}]$. Thus,

$$\bar{\mathrm{x}} = E[\mathrm{x}] = \sum_i x_i P_{\mathrm{x}}(x_i) \tag{8.59a}$$

We shall use both these notations, our choice depending on the circumstances and convenience.
If the RV $\mathrm{x}$ is continuous, an argument similar to that used in arriving at Eq. (8.59a) yields

$$\bar{\mathrm{x}} = E[\mathrm{x}] = \int_{-\infty}^{\infty} x\,p_{\mathrm{x}}(x)\,dx \tag{8.59b}$$

This result can be derived by approximating the continuous variable $\mathrm{x}$ with a discrete variable
by quantizing it in steps of $\Delta x$ and then letting $\Delta x \to 0$.

Equation (8.59b) is more general and includes Eq. (8.59a), because the discrete RV can be
considered as a continuous RV with an impulsive density. In such a case, Eq. (8.59b) reduces
to Eq. (8.59a).

As an example, consider the general Gaussian PDF given by (Fig. 8.11)

$$p_{\mathrm{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2} \tag{8.60a}$$

From Eq. (8.59b) we have

$$\bar{\mathrm{x}} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x\, e^{-(x-m)^2/2\sigma^2}\,dx$$

Changing the variable to $x = y + m$ yields

$$\bar{\mathrm{x}} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (y + m)\, e^{-y^2/2\sigma^2}\,dy$$

$$= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} y\, e^{-y^2/2\sigma^2}\,dy + m\left[\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2\sigma^2}\,dy\right]$$

The first integral is zero, because the integrand is an odd function of $y$. The
term inside the square brackets is the integration of the Gaussian PDF, and is equal to 1. Hence,

$$\bar{\mathrm{x}} = m \tag{8.60b}$$


Mean of a Function of a Random Variable 

It is often necessary to find the mean value of a function of an RV. For instance, in practice we
are often interested in the mean square amplitude of a signal. The mean square amplitude is
the mean of the square of the amplitude $\mathrm{x}$, that is, $\overline{\mathrm{x}^2}$.

In general, we may seek the mean value of an RV $\mathrm{y}$ that is a function of the RV $\mathrm{x}$; that is,
we wish to find $\bar{\mathrm{y}}$ where $\mathrm{y} = g(\mathrm{x})$. Let $\mathrm{x}$ be a discrete RV that takes values $x_1, x_2, \ldots, x_n$ with
probabilities $P_{\mathrm{x}}(x_1), P_{\mathrm{x}}(x_2), \ldots, P_{\mathrm{x}}(x_n)$, respectively. But because $\mathrm{y} = g(\mathrm{x})$, $\mathrm{y}$ takes values
$g(x_1), g(x_2), \ldots, g(x_n)$ with probabilities $P_{\mathrm{x}}(x_1), P_{\mathrm{x}}(x_2), \ldots, P_{\mathrm{x}}(x_n)$, respectively. Hence,
from Eq. (8.59a) we have

$$\bar{\mathrm{y}} = \overline{g(\mathrm{x})} = \sum_{i=1}^{n} g(x_i)\,P_{\mathrm{x}}(x_i) \tag{8.61a}$$

If $\mathrm{x}$ is a continuous RV, a similar line of reasoning leads to

$$\overline{g(\mathrm{x})} = \int_{-\infty}^{\infty} g(x)\,p_{\mathrm{x}}(x)\,dx \tag{8.61b}$$


Example 8.18 The output voltage of a sinusoid generator is $A \cos \omega t$. This output is sampled randomly
(Fig. 8.15a). The sampled output is an RV $\mathrm{x}$, which can take on any value in the range $(-A, A)$.
Determine the mean value ($\bar{\mathrm{x}}$) and the mean square value ($\overline{\mathrm{x}^2}$) of the sampled output $\mathrm{x}$.

If the output is sampled at a random instant $\mathrm{t}$, the output $\mathrm{x}$ is a function of the RV $\mathrm{t}$:

$$\mathrm{x}(\mathrm{t}) = A \cos \omega \mathrm{t}$$

If we let $\omega \mathrm{t} = \Theta$, $\Theta$ is also an RV, and if we consider only modulo-$2\pi$ values of $\Theta$, then
the RV $\Theta$ lies in the range $(0, 2\pi)$. Because $\mathrm{t}$ is randomly chosen, $\Theta$ can take any value
in the range $(0, 2\pi)$ with uniform probability. Because the area under the PDF must be
unity, $p_{\Theta}(\theta)$ is as shown in Fig. 8.15b.

Figure 8.15 Random sampling of a sine-wave generator: (a) sine-wave generator connected to a
voltmeter; (b) the PDF $p_{\Theta}(\theta)$, of height $1/2\pi$ over $(0, 2\pi)$.


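The RV $\mathrm{x}$ is thus a function of another RV, $\Theta$,

$$\mathrm{x} = A \cos \Theta$$

Hence, from Eq. (8.61b),

$$\bar{\mathrm{x}} = \int_{0}^{2\pi} x\,p_{\Theta}(\theta)\,d\theta = \frac{1}{2\pi} \int_{0}^{2\pi} A \cos\theta\,d\theta = 0$$

and

$$\overline{\mathrm{x}^2} = \int_{0}^{2\pi} x^2\,p_{\Theta}(\theta)\,d\theta = \frac{A^2}{2\pi} \int_{0}^{2\pi} \cos^2\theta\,d\theta = \frac{A^2}{2}$$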
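The two integrals of this example are instances of Eq. (8.61b) and can be evaluated numerically. The following sketch uses a simple midpoint rule; the helper name `mean_of_function`, the step count, and the amplitude $A = 2$ are assumptions made only for illustration.

```python
import math


def mean_of_function(g, pdf, lower: float, upper: float, steps: int = 10_000) -> float:
    """Approximate the integral of g(t) * pdf(t) over (lower, upper), as in Eq. (8.61b)."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        t = lower + (k + 0.5) * width  # midpoint of the k-th sub-interval
        total += g(t) * pdf(t)
    return total * width


A = 2.0  # hypothetical amplitude


def uniform_phase_pdf(theta: float) -> float:
    """PDF of the phase: uniform with height 1 / (2 pi) over (0, 2 pi)."""
    return 1.0 / (2.0 * math.pi)


mean_x = mean_of_function(lambda th: A * math.cos(th),
                          uniform_phase_pdf, 0.0, 2.0 * math.pi)
mean_square_x = mean_of_function(lambda th: (A * math.cos(th)) ** 2,
                                 uniform_phase_pdf, 0.0, 2.0 * math.pi)

if __name__ == "__main__":
    print("mean        :", mean_x)         # expected to be close to 0
    print("mean square :", mean_square_x)  # expected to be close to A**2 / 2
```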


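Similarly, for the case of two variables $\mathrm{x}$ and $\mathrm{y}$, we have

$$\overline{g(\mathrm{x}, \mathrm{y})} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x, y)\,p_{\mathrm{xy}}(x, y)\,dx\,dy \tag{8.62}$$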


Mean of the Sum 

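If $g_1(\mathrm{x}, \mathrm{y}), g_2(\mathrm{x}, \mathrm{y}), \ldots, g_n(\mathrm{x}, \mathrm{y})$ are functions of the RVs $\mathrm{x}$ and $\mathrm{y}$, then

$$\overline{g_1(\mathrm{x}, \mathrm{y}) + g_2(\mathrm{x}, \mathrm{y}) + \cdots + g_n(\mathrm{x}, \mathrm{y})} = \overline{g_1(\mathrm{x}, \mathrm{y})} + \overline{g_2(\mathrm{x}, \mathrm{y})} + \cdots + \overline{g_n(\mathrm{x}, \mathrm{y})} \tag{8.63a}$$

The proof is trivial and follows directly from Eq. (8.62).

Thus, the mean (expected value) of the sum is equal to the sum of the means. An important
special case is

$$\overline{\mathrm{x} + \mathrm{y}} = \bar{\mathrm{x}} + \bar{\mathrm{y}} \tag{8.63b}$$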


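Equation (8.63a) can be extended to functions of any number of RVs.

Mean of the Product of Two Functions

Unfortunately, there is no simple result [as in Eq. (8.63)] for the product of two functions. For
the special case where

$$g(\mathrm{x}, \mathrm{y}) = g_1(\mathrm{x})\,g_2(\mathrm{y}) \tag{8.64a}$$

we have

$$\overline{g_1(\mathrm{x})\,g_2(\mathrm{y})} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g_1(x)\,g_2(y)\,p_{\mathrm{xy}}(x, y)\,dx\,dy$$

If $\mathrm{x}$ and $\mathrm{y}$ are independent, then [Eq. (8.53c)]

$$p_{\mathrm{xy}}(x, y) = p_{\mathrm{x}}(x)\,p_{\mathrm{y}}(y)$$

and

$$\overline{g_1(\mathrm{x})\,g_2(\mathrm{y})} = \int_{-\infty}^{\infty} g_1(x)\,p_{\mathrm{x}}(x)\,dx \int_{-\infty}^{\infty} g_2(y)\,p_{\mathrm{y}}(y)\,dy$$

$$= \overline{g_1(\mathrm{x})}\;\overline{g_2(\mathrm{y})} \qquad \text{if } \mathrm{x} \text{ and } \mathrm{y} \text{ independent} \tag{8.64b}$$

A special case of this is

$$\overline{\mathrm{x}\mathrm{y}} = \bar{\mathrm{x}}\,\bar{\mathrm{y}} \qquad \text{if } \mathrm{x} \text{ and } \mathrm{y} \text{ independent} \tag{8.64c}$$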

Moments 

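The $n$th moment of an RV $\mathrm{x}$ is defined as the mean value of $\mathrm{x}^n$. Thus, the $n$th moment of $\mathrm{x}$ is

$$\overline{\mathrm{x}^n} = \int_{-\infty}^{\infty} x^n\,p_{\mathrm{x}}(x)\,dx \tag{8.65a}$$

The $n$th central moment of an RV $\mathrm{x}$ is defined as

$$\overline{(\mathrm{x} - \bar{\mathrm{x}})^n} = \int_{-\infty}^{\infty} (x - \bar{\mathrm{x}})^n\,p_{\mathrm{x}}(x)\,dx \tag{8.65b}$$

The second central moment of an RV $\mathrm{x}$ is of special importance. It is called the variance of
$\mathrm{x}$ and is denoted by $\sigma_{\mathrm{x}}^2$, where $\sigma_{\mathrm{x}}$ is known as the standard deviation (SD) of the RV $\mathrm{x}$. By
definition,

$$\sigma_{\mathrm{x}}^2 = \overline{(\mathrm{x} - \bar{\mathrm{x}})^2}$$

$$= \overline{\mathrm{x}^2} - 2\,\overline{\bar{\mathrm{x}}\,\mathrm{x}} + \bar{\mathrm{x}}^2 = \overline{\mathrm{x}^2} - 2\bar{\mathrm{x}}^2 + \bar{\mathrm{x}}^2$$

$$= \overline{\mathrm{x}^2} - \bar{\mathrm{x}}^2 \tag{8.66}$$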

Thus, the variance of $\mathrm{x}$ is equal to the mean square value minus the square of the mean. When
the mean is zero, the variance is the mean square; that is, $\overline{\mathrm{x}^2} = \sigma_{\mathrm{x}}^2$.

Example 8.19 Find the mean square and the variance of the Gaussian RV with the PDF in Eq. (8.39) [see
Fig. 8.11].

We have

$$\overline{\mathrm{x}^2} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2\, e^{-(x-m)^2/2\sigma^2}\,dx$$

Changing the variable to $y = (x - m)/\sigma$ and integrating, we get

$$\overline{\mathrm{x}^2} = \sigma^2 + m^2 \tag{8.67a}$$

Also, from Eqs. (8.66) and (8.60b),

$$\sigma_{\mathrm{x}}^2 = \overline{\mathrm{x}^2} - \bar{\mathrm{x}}^2 = (\sigma^2 + m^2) - (m)^2 = \sigma^2 \tag{8.67b}$$

Hence, a Gaussian RV described by the density in Eq. (8.60a) has mean $m$ and variance $\sigma^2$.
In other words, the Gaussian density function is completely specified by the first moment
($\bar{\mathrm{x}}$) and the second moment ($\overline{\mathrm{x}^2}$).
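As a numerical cross-check of Eqs. (8.60b), (8.67a), and (8.67b), the moment integral of Eq. (8.65a) can be approximated for a Gaussian density. The sketch below is only illustrative: the values $m = 1.5$ and $\sigma = 0.5$, the integration span of $\pm 12\sigma$, and the step count are assumptions, and the function names are hypothetical.

```python
import math


def gaussian_pdf(x: float, m: float, sigma: float) -> float:
    """Gaussian density of Eq. (8.60a)."""
    return math.exp(-(x - m) ** 2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_moment(n: int, m: float, sigma: float,
                    span: float = 12.0, steps: int = 20_000) -> float:
    """Approximate the nth moment, Eq. (8.65a), by the trapezoidal rule on m +/- span*sigma."""
    lower = m - span * sigma
    width = 2.0 * span * sigma / steps
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * width
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * x**n * gaussian_pdf(x, m, sigma)
    return total * width


if __name__ == "__main__":
    m, sigma = 1.5, 0.5  # hypothetical parameters
    mean = gaussian_moment(1, m, sigma)
    mean_square = gaussian_moment(2, m, sigma)
    print("mean        :", mean)                    # compare with m
    print("mean square :", mean_square)             # compare with sigma**2 + m**2
    print("variance    :", mean_square - mean**2)   # compare with sigma**2, Eq. (8.66)
```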



Example 8.20 Mean Square of the Uniform Quantization Error in PCM

In the PCM scheme discussed in Chapter 6, a signal band-limited to $B$ Hz is sampled at
a rate of $2B$ samples per second. The entire range $(-m_p, m_p)$ of the signal amplitudes is
partitioned into $L$ uniform intervals, each of magnitude $2m_p/L$ (Fig. 8.16a). Each sample is
approximated to the midpoint of the interval in which it falls. Thus, sample $m$ in Fig. 8.16a
is approximated by a value $\hat{m}$, the midpoint of the interval in which $m$ falls. Each sample
is thus approximated (quantized) to one of the $L$ numbers.

The difference $\mathrm{q} = m - \hat{m}$ is the quantization error and is an RV. We shall determine
$\overline{\mathrm{q}^2}$, the mean square value of the quantization error. From Fig. 8.16a it can be seen that $\mathrm{q}$
is a continuous RV existing over the range $(-m_p/L, m_p/L)$ and is zero outside this range.
If we assume that it is equally likely for the sample to lie anywhere in the quantizing
interval,* then the PDF of $\mathrm{q}$ is uniform

$$p_{\mathrm{q}}(q) = \frac{L}{2m_p} \qquad q \in (-m_p/L,\; m_p/L)$$

* Because the quantizing interval is generally very small, variations in the PDF of signal amplitudes over the
interval are small and this assumption is reasonable.
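The mean square of this uniformly distributed error can be estimated by quantizing a dense, evenly spaced set of sample values and averaging the squared error. The sketch below is a rough illustration under the example's uniform assumption; the function names, the choice $m_p = 1$, $L = 16$, and the number of test points per interval are all hypothetical. The estimate can be compared with $(1/3)(m_p/L)^2$, the quantization term that appears in Example 8.22.

```python
import math


def quantize_midpoint(m: float, m_p: float, levels: int) -> float:
    """Map a sample in (-m_p, m_p) to the midpoint of its quantizing interval."""
    step = 2.0 * m_p / levels
    index = min(int(math.floor((m + m_p) / step)), levels - 1)
    return -m_p + (index + 0.5) * step


def mean_square_quantization_error(m_p: float, levels: int,
                                   samples_per_interval: int = 1000) -> float:
    """Average q**2 over evenly spaced samples, with q = m - quantized(m)."""
    count = levels * samples_per_interval
    cell = 2.0 * m_p / count
    total = 0.0
    for k in range(count):
        m = -m_p + (k + 0.5) * cell  # evenly spaced test samples
        q = m - quantize_midpoint(m, m_p, levels)
        total += q * q
    return total / count


if __name__ == "__main__":
    m_p, levels = 1.0, 16  # hypothetical values
    print("estimated mean square error:", mean_square_quantization_error(m_p, levels))
    print("(1/3) * (m_p / L)**2       :", (m_p / levels) ** 2 / 3.0)
```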





Example 8.21 Mean Square Error Caused by Channel Noise in PCM

Quantization noise is one of the sources of error in PCM. The other source of error is channel
noise. Each quantized sample is coded by a group of $n$ binary pulses. Because of channel noise,
some of these pulses are incorrectly detected at the receiver. Hence, the decoded sample value
$\tilde{m}$ at the receiver will differ from the quantized sample value $\hat{m}$ that is transmitted. The error
$\epsilon = \hat{m} - \tilde{m}$ is a random variable. Let us calculate $\overline{\epsilon^2}$, the mean square error in the sample value
caused by the channel noise.

To begin with, let us determine the values that $\epsilon$ can take and the corresponding probabilities.
Each sample is transmitted by $n$ binary pulses. The value of $\epsilon$ depends on the position
of the incorrectly detected pulse. Consider, for example, the case of $L = 16$ transmitted by
four binary pulses ($n = 4$), as shown in Fig. 1.5. Here the transmitted code 1101 represents
a value of 13. A detection error in the first digit changes the received code to 0101, which
is a value of 5. This causes an error $\epsilon = 8$. Similarly, an error in the second digit gives
$\epsilon = 4$. Errors in the third and the fourth digits will give $\epsilon = 2$ and $\epsilon = 1$, respectively.
In general, the error in the $i$th digit causes an error $\epsilon_i = (2^{-i})16$. For a general case, the
error $\epsilon_i = (2^{-i})F$, where $F$ is the full scale, that is, $2m_p$, in PCM. Thus,

$$\epsilon_i = (2^{-i})(2m_p) \qquad i = 1, 2, \ldots, n$$

Note that the error $\epsilon$ is a discrete RV. Hence,*

$$\overline{\epsilon^2} = \sum_{i=1}^{n} \epsilon_i^2\,P_{\epsilon}(\epsilon_i) \tag{8.69}$$

Because $P_{\epsilon}(\epsilon_i)$ is the probability that $\epsilon = \epsilon_i$, $P_{\epsilon}(\epsilon_i)$ is the probability of error in the
detection of the $i$th digit. Because the error probability of detecting any one digit is the
same as that of any other, that is, $P_e$,

$$\overline{\epsilon^2} = P_e \sum_{i=1}^{n} \epsilon_i^2 = P_e \sum_{i=1}^{n} 4m_p^2\,(2^{-2i}) = 4m_p^2 P_e \sum_{i=1}^{n} 2^{-2i}$$

This summation is a geometric progression with a common ratio $r = 2^{-2}$, with the first
term $a_1 = 2^{-2}$ and the last term $a_n = 2^{-2n}$. Hence (see Appendix E.4),

$$\overline{\epsilon^2} = 4m_p^2 P_e \left[\frac{(2^{-2})\,2^{-2n} - 2^{-2}}{2^{-2} - 1}\right] = \frac{4m_p^2 P_e\,(2^{2n} - 1)}{3\,(2^{2n})} \tag{8.70a}$$

Note that the magnitude of the error $\epsilon$ varies from $2^{-1}(2m_p)$ to $2^{-n}(2m_p)$. The error $\epsilon$
can be positive as well as negative. For example, $\epsilon = 8$ because of a first-digit error
in 1101. But the corresponding error $\epsilon$ will be $-8$ if the transmitted code is 0101. Of
course the sign of $\epsilon$ does not matter in Eq. (8.69). It must be remembered, however,
that $\epsilon$ varies from $-2^{-1}(2m_p)$ to $2^{-1}(2m_p)$ and its probabilities are symmetrical about
$\epsilon = 0$. Hence, $\bar{\epsilon} = 0$ and

$$\sigma_{\epsilon}^2 = \overline{\epsilon^2} = \frac{4m_p^2 P_e\,(2^{2n} - 1)}{3\,(2^{2n})} \tag{8.70b}$$

* Here we are assuming that the error can occur only in one of the $n$ digits. But more than one digit may be in error.
Because the digit error probability $P_e \ll 1$ (on the order of $10^{-5}$ or less), however, the probability of more than one
wrong digit is extremely small (see Example 8.6), and its contribution $\epsilon_i^2 P_{\epsilon}(\epsilon_i)$ is negligible.
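The closed form in Eq. (8.70a) can be checked against the term-by-term sum of Eq. (8.69). The sketch below does this for the four-digit case discussed above; the function names and the value chosen for $P_e$ are illustrative assumptions.

```python
def digit_errors(m_p: float, n: int) -> list:
    """Error magnitudes eps_i = 2**(-i) * (2 * m_p) for digits i = 1..n."""
    return [2.0 ** (-i) * 2.0 * m_p for i in range(1, n + 1)]


def channel_mse_by_sum(m_p: float, n: int, p_e: float) -> float:
    """Eq. (8.69) with every digit equally likely to be in error (probability p_e)."""
    return p_e * sum(e * e for e in digit_errors(m_p, n))


def channel_mse_closed_form(m_p: float, n: int, p_e: float) -> float:
    """Eq. (8.70a): 4 m_p^2 P_e (2^(2n) - 1) / (3 * 2^(2n))."""
    return 4.0 * m_p**2 * p_e * (2 ** (2 * n) - 1) / (3.0 * 2 ** (2 * n))


if __name__ == "__main__":
    # L = 16 levels with full scale 2 * m_p = 16, i.e. m_p = 8 and n = 4.
    print(digit_errors(8.0, 4))        # error caused by each of the four digits
    print(0b1101 - (0b1101 ^ 0b1000))  # first-digit error in 1101: 13 - 5
    p_e = 1e-5                         # hypothetical digit error probability
    print(channel_mse_by_sum(8.0, 4, p_e), channel_mse_closed_form(8.0, 4, p_e))
```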


Variance of a Sum of Independent Random Variables 

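The variance of a sum of independent RVs is equal to the sum of their variances. Thus, if $\mathrm{x}$
and $\mathrm{y}$ are independent RVs and

$$\mathrm{z} = \mathrm{x} + \mathrm{y}$$

then

$$\sigma_{\mathrm{z}}^2 = \sigma_{\mathrm{x}}^2 + \sigma_{\mathrm{y}}^2 \tag{8.71}$$

This can be shown as follows:

$$\sigma_{\mathrm{z}}^2 = \overline{(\mathrm{z} - \bar{\mathrm{z}})^2} = \overline{[\mathrm{x} + \mathrm{y} - (\bar{\mathrm{x}} + \bar{\mathrm{y}})]^2}$$

$$= \overline{[(\mathrm{x} - \bar{\mathrm{x}}) + (\mathrm{y} - \bar{\mathrm{y}})]^2}$$

$$= \overline{(\mathrm{x} - \bar{\mathrm{x}})^2} + \overline{(\mathrm{y} - \bar{\mathrm{y}})^2} + 2\,\overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})}$$

$$= \sigma_{\mathrm{x}}^2 + \sigma_{\mathrm{y}}^2 + 2\,\overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})}$$

Because $\mathrm{x}$ and $\mathrm{y}$ are independent RVs, $(\mathrm{x} - \bar{\mathrm{x}})$ and $(\mathrm{y} - \bar{\mathrm{y}})$ are also independent RVs. Hence,
from Eq. (8.64b) we have

$$\overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})} = \overline{(\mathrm{x} - \bar{\mathrm{x}})} \cdot \overline{(\mathrm{y} - \bar{\mathrm{y}})}$$

But

$$\overline{(\mathrm{x} - \bar{\mathrm{x}})} = \bar{\mathrm{x}} - \bar{\mathrm{x}} = 0$$

Similarly,

$$\overline{(\mathrm{y} - \bar{\mathrm{y}})} = 0$$

and

$$\sigma_{\mathrm{z}}^2 = \sigma_{\mathrm{x}}^2 + \sigma_{\mathrm{y}}^2$$

This result can be extended to any number of variables. If RVs $\mathrm{x}$ and $\mathrm{y}$ both have zero means
(i.e., $\bar{\mathrm{x}} = \bar{\mathrm{y}} = 0$), then $\bar{\mathrm{z}} = \bar{\mathrm{x}} + \bar{\mathrm{y}} = 0$. Also, because the variance equals the mean square value
when the mean is zero, it follows that

$$\overline{\mathrm{z}^2} = \overline{(\mathrm{x} + \mathrm{y})^2} = \overline{\mathrm{x}^2} + \overline{\mathrm{y}^2} \tag{8.72}$$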


provided $\bar{\mathrm{x}} = \bar{\mathrm{y}} = 0$, and provided $\mathrm{x}$ and $\mathrm{y}$ are independent RVs.

Example 8.22 Total Mean Square Error in PCM

In PCM, as seen in Examples 8.20 and 8.21, a signal sample $m$ is transmitted as a quantized
sample $\hat{m}$, causing a quantization error $\mathrm{q} = m - \hat{m}$. Because of channel noise, the
transmitted sample $\hat{m}$ is read as $\tilde{m}$, causing a detection error $\epsilon = \hat{m} - \tilde{m}$. Hence, the actual
signal sample $m$ is received as $\tilde{m}$ with a total error

$$m - \tilde{m} = (m - \hat{m}) + (\hat{m} - \tilde{m}) = \mathrm{q} + \epsilon$$

where both $\mathrm{q}$ and $\epsilon$ are zero mean RVs. Because the quantization error $\mathrm{q}$ and the
channel-noise error $\epsilon$ are independent, the mean square of the sum is [see Eq. (8.72)]

$$\overline{(m - \tilde{m})^2} = \overline{(\mathrm{q} + \epsilon)^2} = \overline{\mathrm{q}^2} + \overline{\epsilon^2}$$

$$= \frac{1}{3}\left(\frac{m_p}{L}\right)^2 + \frac{4m_p^2 P_e\,(2^{2n} - 1)}{3\,(2^{2n})}$$

Also, because $L = 2^n$,

$$\overline{(m - \tilde{m})^2} = \overline{\mathrm{q}^2} + \overline{\epsilon^2} = \frac{m_p^2}{3\,(2^{2n})}\left[1 + 4P_e\,(2^{2n} - 1)\right] \tag{8.73}$$
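To see how the two error sources compare, Eq. (8.73) can be evaluated alongside its two components. The sketch below is illustrative only: the function names are hypothetical, and the values $m_p = 1$, $n = 8$, and $P_e = 10^{-5}$ are assumptions chosen for the demonstration.

```python
def quantization_mse(m_p: float, n: int) -> float:
    """Quantization term (1/3) * (m_p / L)**2 with L = 2**n levels."""
    levels = 2**n
    return (m_p / levels) ** 2 / 3.0


def channel_mse(m_p: float, n: int, p_e: float) -> float:
    """Channel-noise term of Eq. (8.70a)."""
    return 4.0 * m_p**2 * p_e * (2 ** (2 * n) - 1) / (3.0 * 2 ** (2 * n))


def total_mse(m_p: float, n: int, p_e: float) -> float:
    """Total mean square error, Eq. (8.73)."""
    return m_p**2 / (3.0 * 2 ** (2 * n)) * (1.0 + 4.0 * p_e * (2 ** (2 * n) - 1))


if __name__ == "__main__":
    m_p, n, p_e = 1.0, 8, 1e-5  # hypothetical values
    print("quantization :", quantization_mse(m_p, n))
    print("channel noise:", channel_mse(m_p, n, p_e))
    print("total        :", total_mse(m_p, n, p_e))
```

Varying `p_e` in this sketch shows the two regimes implied by Eq. (8.73): when $4P_e(2^{2n} - 1) \ll 1$ the quantization term dominates, and otherwise the channel-noise term does.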


Chebyshev’s Inequality 

The standard deviation of an RV $\mathrm{x}$ is a measure of the width of its PDF. The larger the $\sigma_{\mathrm{x}}$, the
wider the PDF. Figure 8.17 illustrates this effect for a Gaussian PDF. Chebyshev's inequality
is a statement of this fact. It states that for a zero mean RV $\mathrm{x}$

$$P(|\mathrm{x}| < k\sigma_{\mathrm{x}}) \ge 1 - \frac{1}{k^2} \tag{8.74}$$

This means the probability of observing $\mathrm{x}$ within a few standard deviations is very high. For
example, the probability of finding $|\mathrm{x}|$ within $3\sigma_{\mathrm{x}}$ is equal to or greater than 0.88. Thus, for a
PDF with $\sigma_{\mathrm{x}} = 1$, $P(|\mathrm{x}| < 3) \ge 0.88$, whereas for a PDF with $\sigma_{\mathrm{x}} = 3$, $P(|\mathrm{x}| < 9) \ge 0.88$. It
is clear that the PDF with $\sigma_{\mathrm{x}} = 3$ is spread out much more than the PDF with $\sigma_{\mathrm{x}} = 1$. Hence,
$\sigma_{\mathrm{x}}$ or $\sigma_{\mathrm{x}}^2$ is often used as a measure of the width of a PDF. In Chapter 10, we shall use this
measure to estimate the bandwidth of a signal spectrum.

Figure 8.17 Gaussian PDF with standard deviations $\sigma = 1$ and $\sigma = 3$.

The proof of Eq. (8.74) is as follows:

$$\sigma_{\mathrm{x}}^2 = \int_{-\infty}^{\infty} x^2\,p_{\mathrm{x}}(x)\,dx$$

Because the integrand is positive,

$$\sigma_{\mathrm{x}}^2 \ge \int_{|x| \ge k\sigma_{\mathrm{x}}} x^2\,p_{\mathrm{x}}(x)\,dx$$

If we replace $x$ by its smallest value $k\sigma_{\mathrm{x}}$, the inequality still holds,

$$\sigma_{\mathrm{x}}^2 \ge k^2\sigma_{\mathrm{x}}^2 \int_{|x| \ge k\sigma_{\mathrm{x}}} p_{\mathrm{x}}(x)\,dx = k^2\sigma_{\mathrm{x}}^2\,P(|\mathrm{x}| \ge k\sigma_{\mathrm{x}})$$

or

$$P(|\mathrm{x}| \ge k\sigma_{\mathrm{x}}) \le \frac{1}{k^2}$$

Hence,

$$P(|\mathrm{x}| < k\sigma_{\mathrm{x}}) \ge 1 - \frac{1}{k^2}$$

This inequality can be generalized for a nonzero mean RV as:

$$P(|\mathrm{x} - \bar{\mathrm{x}}| < k\sigma_{\mathrm{x}}) \ge 1 - \frac{1}{k^2} \tag{8.75}$$


Example 8.23 Estimate the width, or spread, of a Gaussian PDF [Eq. (8.60a)].

For a Gaussian RV [see Eqs. (8.35) and (8.40b)]

$$P(|\mathrm{x} - \bar{\mathrm{x}}| < \sigma) = 1 - 2Q(1) = 0.6826$$

$$P(|\mathrm{x} - \bar{\mathrm{x}}| < 2\sigma) = 1 - 2Q(2) = 0.9546$$

$$P(|\mathrm{x} - \bar{\mathrm{x}}| < 3\sigma) = 1 - 2Q(3) = 0.9974$$

This means that the area under the PDF over the interval $(\bar{\mathrm{x}} - 3\sigma, \bar{\mathrm{x}} + 3\sigma)$ is 99.74% of the
total area. A negligible fraction (0.26%) of the area lies outside this interval. Hence, the
width, or spread, of the Gaussian PDF may be considered roughly $\pm 3\sigma$ about its mean,
giving a total width of roughly $6\sigma$.
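The Chebyshev bound of Eq. (8.75) holds for any PDF, so for a particular density it can be quite loose. The sketch below compares the bound with the Gaussian probabilities $1 - 2Q(k)$ of this example. It assumes the identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$, and the function names are illustrative. The computed Gaussian values agree with those quoted above to about three decimal places; small differences in the fourth decimal come from rounding in the tabulated $Q$ values.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x), via the assumed identity 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def chebyshev_lower_bound(k: float) -> float:
    """Lower bound 1 - 1/k**2 on P(|x - mean| < k * sigma), Eq. (8.75)."""
    return 1.0 - 1.0 / k**2


def gaussian_within(k: float) -> float:
    """P(|x - mean| < k * sigma) = 1 - 2 Q(k) for a Gaussian RV."""
    return 1.0 - 2.0 * q_function(k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(chebyshev_lower_bound(k), 4), round(gaussian_within(k), 4))
```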


8.4 CORRELATION 


Often we are interested in determining the nature of dependence between two entities, such
as smoking and lung cancer. Consider a random experiment with two outcomes described by
RVs $\mathrm{x}$ and $\mathrm{y}$. We conduct several trials of this experiment and record values of $\mathrm{x}$ and $\mathrm{y}$ for each
trial. From this data, it may be possible to determine the nature of a dependence between $\mathrm{x}$
and $\mathrm{y}$. The covariance of RVs $\mathrm{x}$ and $\mathrm{y}$ is one measure that is simple to compute and can yield
useful information about the dependence between $\mathrm{x}$ and $\mathrm{y}$.

The covariance $\sigma_{\mathrm{xy}}$ of two RVs is defined as

$$\sigma_{\mathrm{xy}} = \overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})} \tag{8.76}$$

Note that the concept of covariance is a natural extension of the concept of variance, which is
defined as

$$\sigma_{\mathrm{x}}^2 = \overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{x} - \bar{\mathrm{x}})}$$

Let us consider a case of two variables x and y that are dependent such that they tend 
to vary in harmony; that is, if x increases y increases, and if x decreases y also decreases. 
For instance, x may be the average daily temperature of a city and y the volume of soft drink 
sales that day in the city. It is reasonable to expect the two quantities to vary in harmony for a 
majority of the cases. Suppose we consider the following experiment: pick a random day and 
record the average temperature of that day as the value of x and the soft drink sales volume 
that day as the value of y. We perform this measurement over several days (several trials of 
the experiment) and record the data x and y for each trial. We now plot points (x, y) for all 
the trials. This plot, known as the scatter diagram, may appear as shown in Fig. 8.18a. The 
plot shows that when x is large, y is likely to be large. Note the use of the word likely. It is not 
always true that y will be large if x is large, but it is true most of the time. In other words, in 
a few cases, a low average temperature will be paired with higher soft drink sales owing to 
some atypical situation, such as a major soccer match. This is quite obvious from the scatter 
diagram in Fig. 8.18a. 

To continue this example, the variable $\mathrm{x} - \bar{\mathrm{x}}$ represents the difference between actual
and average values of $\mathrm{x}$, and $\mathrm{y} - \bar{\mathrm{y}}$ represents the difference between actual and average
values of $\mathrm{y}$. It is more instructive to plot $(\mathrm{y} - \bar{\mathrm{y}})$ vs. $(\mathrm{x} - \bar{\mathrm{x}})$. This is the same as the scatter
diagram in Fig. 8.18a with the origin shifted to $(\bar{\mathrm{x}}, \bar{\mathrm{y}})$, as in Fig. 8.18b, which shows that a day
with an above-average temperature is likely to produce above-average soft drink sales, and
a day with a below-average temperature is likely to produce below-average soft drink sales.
That is, if $\mathrm{x} - \bar{\mathrm{x}}$ is positive, $\mathrm{y} - \bar{\mathrm{y}}$ is likely to be positive, and if $\mathrm{x} - \bar{\mathrm{x}}$ is negative, $\mathrm{y} - \bar{\mathrm{y}}$
is more likely to be negative. Thus, the quantity $(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})$ will be positive for most
trials. We compute this product for every pair, add these products, and then divide by the
number of trials. The result is the mean value of $(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})$, that is, the covariance
$\sigma_{\mathrm{xy}} = \overline{(\mathrm{x} - \bar{\mathrm{x}})(\mathrm{y} - \bar{\mathrm{y}})}$. The covariance will be positive in the example under consideration. In such
cases, we say that a positive correlation exists between variables $\mathrm{x}$ and $\mathrm{y}$. We can conclude that a
positive correlation implies variation of two variables in harmony (in the same direction, up or
down).

Figure 8.18 Scatter diagrams: (a), (b) positive correlation; (c) negative correlation;
(d) zero correlation.

Next, we consider the case of the two variables: x, the average daily temperature, and z,
the sales volume of sweaters that day. It is reasonable to believe that as x (daily average
temperature) increases, z (the sweater sales volume) tends to decrease. A hypothetical scatter
diagram for this experiment is shown in Fig. 8.18c. Thus, if $x - \bar{x}$ is positive (above-average
temperature), $z - \bar{z}$ is likely to be negative (below-average sweater sales). Similarly, when
$x - \bar{x}$ is negative, $z - \bar{z}$ is likely to be positive. The product $(x - \bar{x})(z - \bar{z})$ will be negative
for most of the trials, and the mean $\overline{(x - \bar{x})(z - \bar{z})} = \sigma_{xz}$ will be negative. In such a case, we
say that negative correlation exists between x and z. It should be stressed here that negative
correlation does not mean that x and z are unrelated. It means that they are dependent, but
when one increases, the other decreases, and vice versa.

Last, consider the variables x (the average daily temperature) and w (the number of births).
It is reasonable to expect that the daily temperature has little to do with the number of children
born. A hypothetical scatter diagram for this case will appear as shown in Fig. 8.18d. If $x - \bar{x}$
is positive, $w - \bar{w}$ is equally likely to be positive or negative. The product $(x - \bar{x})(w - \bar{w})$ is
therefore equally likely to be positive or negative, and the mean $\overline{(x - \bar{x})(w - \bar{w})} = \sigma_{xw}$ will
be zero. In such a case, we say that RVs x and w are uncorrelated.

To reiterate, if $\sigma_{xy}$ is positive (or negative), then x and y are said to have a positive (or
negative) correlation, and if $\sigma_{xy} = 0$, then the variables x and y are said to be uncorrelated.

From this discussion, it appears that under suitable conditions, covariance can serve as 
a measure of the dependence of two variables. It often provides some information about the 
interdependence of the two RVs and proves useful in a number of applications. 

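The covariance $\sigma_{xy}$ may be expressed in another way, as follows. By definition,

$$
\begin{aligned}
\sigma_{xy} &= \overline{(x - \bar{x})(y - \bar{y})} \\
&= \overline{xy} - \overline{\bar{x}y} - \overline{x\bar{y}} + \overline{\bar{x}\bar{y}} \\
&= \overline{xy} - \bar{x}\bar{y} - \bar{x}\bar{y} + \bar{x}\bar{y} \\
&= \overline{xy} - \bar{x}\bar{y} \qquad (8.77)
\end{aligned}
$$

From Eq. (8.77) it follows that the variables x and y are uncorrelated ($\sigma_{xy} = 0$) if

$$\overline{xy} = \bar{x}\bar{y} \qquad (8.78)$$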

The correlation between x and y cannot be directly compared with the correlation between z
and w. This is because different RVs may differ in strength (scale). To be fair, the covariance value
should be normalized appropriately. For this reason, the definition of correlation coefficient
is particularly useful. The correlation coefficient $\rho_{xy}$ is $\sigma_{xy}$ normalized by $\sigma_x \sigma_y$,

$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \qquad (8.79)$$

Thus, if x and y are uncorrelated, then $\rho_{xy} = 0$. Also, it can be shown (Prob. 8.5-1) that

$$-1 \le \rho_{xy} \le 1 \qquad (8.80)$$
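As a rough numerical illustration of Eqs. (8.77) and (8.79), the following Python sketch computes the covariance in both forms and the correlation coefficient from paired observations. The temperature and sales figures are invented purely for illustration (they are not data from the text), and the helper names `mean`, `covariance`, and `correlation_coefficient` are assumptions of this sketch.

```python
import math


def mean(values):
    """Arithmetic average, standing in for the overbar (mean) operation."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """sigma_xy as the mean of (x - xbar)(y - ybar)."""
    xbar, ybar = mean(xs), mean(ys)
    return mean([(x - xbar) * (y - ybar) for x, y in zip(xs, ys)])


def correlation_coefficient(xs, ys):
    """rho_xy = sigma_xy / (sigma_x * sigma_y), as in Eq. (8.79)."""
    sigma_x = math.sqrt(covariance(xs, xs))
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)


# Hypothetical observations (assumed values, for illustration only).
temperature = [12.0, 15.0, 18.0, 21.0, 24.0, 27.0, 30.0]
soft_drinks = [110.0, 135.0, 150.0, 190.0, 205.0, 240.0, 260.0]
sweaters = [95.0, 80.0, 72.0, 50.0, 41.0, 22.0, 15.0]

# Covariance from the definition and from the alternative form of Eq. (8.77).
cov_direct = covariance(temperature, soft_drinks)
cov_alt = (mean([x * y for x, y in zip(temperature, soft_drinks)])
           - mean(temperature) * mean(soft_drinks))

rho_drinks = correlation_coefficient(temperature, soft_drinks)
rho_sweaters = correlation_coefficient(temperature, sweaters)

print("covariance (definition):", cov_direct)
print("covariance (Eq. 8.77)  :", cov_alt)
print("rho, temperature vs. soft drinks:", rho_drinks)
print("rho, temperature vs. sweaters   :", rho_sweaters)
```

Because the correlation coefficient is normalized, the two values of rho can be compared directly even though the two sales series are on different scales.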


Independence vs. Uncorrelatedness

Note that for independent RVs [Eq. (8.64c)]

$$\overline{xy} = \bar{x}\bar{y} \quad \text{and} \quad \sigma_{xy} = 0$$

Hence, independent RVs are uncorrelated. This supports the heuristic argument presented
earlier. It should be noted that whereas independent variables are uncorrelated, the converse
is not necessarily true: uncorrelated variables are generally not independent (Prob. 8.5-3).
Independence is, in general, a stronger and more restrictive condition than uncorrelatedness.
For independent variables, we have shown [Eq. (8.64b)] that, when the expectations exist,

$$\overline{g_1(x)\,g_2(y)} = \overline{g_1(x)}\;\overline{g_2(y)}$$

for any functions $g_1(\cdot)$ and $g_2(\cdot)$, whereas for uncorrelatedness, the only requirement is that

$$\overline{xy} = \bar{x}\bar{y}$$

An important special case in which independence and uncorrelatedness are equivalent arises
when random variables x and y are jointly Gaussian. Note that when x and y are jointly
Gaussian, individually x and y are also Gaussian.
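The gap between the two conditions can be checked numerically. The sketch below uses the pair x = cos Θ, y = sin Θ from Prob. 8.5-3, replacing the uniformly distributed Θ by an evenly spaced grid of angles (an assumption made here so that the averages are deterministic). It shows that the covariance of x and y vanishes while the covariance of $x^2$ and $y^2$ does not, so the product rule fails for $g_1(x) = x^2$, $g_2(y) = y^2$.

```python
import math


def mean(values):
    """Arithmetic average, standing in for the overbar (mean) operation."""
    return sum(values) / len(values)


# Evenly spaced angles over one period stand in for a uniform Theta.
N = 3600
thetas = [2.0 * math.pi * k / N for k in range(N)]
xs = [math.cos(t) for t in thetas]
ys = [math.sin(t) for t in thetas]

# Uncorrelatedness: mean(xy) - mean(x) mean(y) should vanish.
cov_xy = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Independence would also need mean(x^2 y^2) = mean(x^2) mean(y^2).
x_sq = [x * x for x in xs]
y_sq = [y * y for y in ys]
cov_x2_y2 = mean([a * b for a, b in zip(x_sq, y_sq)]) - mean(x_sq) * mean(y_sq)

print("covariance of x and y    :", cov_xy)
print("covariance of x^2 and y^2:", cov_x2_y2)
```

Since $x^2 + y^2 = 1$ always, knowing x fixes the magnitude of y, which is why the second covariance is nonzero.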

Mean Square of the Sum of Uncorrelated Variables

If x and y are uncorrelated, then for z = x + y we show that

$$\sigma_z^2 = \sigma_x^2 + \sigma_y^2 \qquad (8.81)$$

That is, the variance of the sum is the sum of variances for uncorrelated RVs. We have proved
this result earlier for independent variables x and y. Following the development after Eq. (8.71),
we have

$$
\begin{aligned}
\sigma_z^2 &= \overline{[(x - \bar{x}) + (y - \bar{y})]^2} \\
&= \overline{(x - \bar{x})^2} + \overline{(y - \bar{y})^2} + 2\,\overline{(x - \bar{x})(y - \bar{y})} \\
&= \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}
\end{aligned}
$$

Because x and y are uncorrelated, $\sigma_{xy} = 0$, and Eq. (8.81) follows. If x and y have zero means,
then z also has a zero mean, and the mean square values of these variables are equal to their
variances. Hence,

$$\overline{(x + y)^2} = \overline{x^2} + \overline{y^2} \qquad (8.82)$$

if x and y are uncorrelated and have zero means. Thus, Eqs. (8.81) and (8.82) are valid not
only when x and y are independent, but also under the less restrictive condition that x and y
be uncorrelated.
8.5 LINEAR MEAN SQUARE ESTIMATION

When two random variables x and y are related (or dependent), then a knowledge of one gives
certain information about the other. Hence, it is possible to estimate the value of (parameter
or signal) y from a knowledge of the value of x. The estimate of y will be another random
variable $\hat{y}$. The estimated random variable $\hat{y}$ will in general be different from the actual y.
One may choose various criteria of goodness for estimation. Minimum mean square error is
one possible criterion. The optimum estimate in this case minimizes the mean square error $\overline{\epsilon^2}$
given by

$$\overline{\epsilon^2} = \overline{(y - \hat{y})^2}$$

In general, the optimum estimate $\hat{y}$ is a nonlinear function of x.* We simplify the problem by
constraining the estimate $\hat{y}$ to be a linear function of x of the form

$$\hat{y} = ax$$

assuming that $\bar{x} = 0$.† In this case,

$$
\begin{aligned}
\overline{\epsilon^2} &= \overline{(y - \hat{y})^2} = \overline{(y - ax)^2} \\
&= \overline{y^2} + a^2\,\overline{x^2} - 2a\,\overline{xy}
\end{aligned}
$$

To minimize $\overline{\epsilon^2}$, we have

$$\frac{\partial \overline{\epsilon^2}}{\partial a} = 2a\,\overline{x^2} - 2\,\overline{xy} = 0$$

Hence,

$$a = \frac{\overline{xy}}{\overline{x^2}} = \frac{R_{xy}}{R_{xx}} \qquad (8.83)$$

where $R_{xy} = \overline{xy}$, $R_{xx} = \overline{x^2}$, and $R_{yy} = \overline{y^2}$. Note that for this constant choice of a,

$$\epsilon = y - ax = y - \frac{R_{xy}}{R_{xx}}\,x$$

Hence,

$$\overline{x\epsilon} = \overline{x\left(y - \frac{R_{xy}}{R_{xx}}\,x\right)} = \overline{xy} - \frac{R_{xy}}{R_{xx}}\,\overline{x^2}$$

* It can be shown [5] that the optimum estimate $\hat{y}$ is the conditional mean of y when $\mathrm{x} = x$, that is,

$$\hat{y} = E[\mathrm{y} \mid \mathrm{x} = x]$$

In general, this is a nonlinear function of x.

† Throughout the discussion, the variables x, y, ... will be assumed to have zero mean. This can be done without loss
of generality. If the variables have nonzero means, we can form new variables $x' = x - \bar{x}$ and $y' = y - \bar{y}$,
and so on. The new variables obviously have zero mean values.

Since by definition $\overline{xy} = R_{xy}$ and $\overline{xx} = \overline{x^2} = R_{xx}$, we have

$$\overline{x\epsilon} = R_{xy} - R_{xy} = 0 \qquad (8.84)$$

The condition of Eq. (8.84) is known as the principle of orthogonality. The physical
interpretation is that the data (x) used in estimation and the (minimum) error ($\epsilon$) are orthogonal
(implying uncorrelatedness in this case) when the mean square error is minimum.

Given the principle of orthogonality, the minimum mean square error is given by

$$
\begin{aligned}
\overline{\epsilon^2} &= \overline{(y - ax)^2} \\
&= \overline{(y - ax)\,y} - a\,\overline{\epsilon x} \\
&= \overline{(y - ax)\,y} \\
&= \overline{y^2} - a\,\overline{yx} \\
&= R_{yy} - aR_{xy} \qquad (8.85)
\end{aligned}
$$
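The single-variable estimator can be tried on synthetic data. The sketch below is only an illustration: the model y = 0.8x + noise, the sample size, and the seed are all assumptions chosen here, and sample averages stand in for the true means. It computes a from Eq. (8.83), then checks the orthogonality condition of Eq. (8.84) and the error expression of Eq. (8.85).

```python
import random


def mean(values):
    """Arithmetic average, standing in for the overbar (mean) operation."""
    return sum(values) / len(values)


random.seed(0)            # fixed seed so the run is repeatable
N = 5000

# Assumed synthetic model: y depends linearly on x plus independent noise.
xs = [random.gauss(0.0, 1.0) for _ in range(N)]
ys = [0.8 * x + random.gauss(0.0, 0.5) for x in xs]

# Remove the sample means so the zero-mean assumption of the text holds.
x_mean, y_mean = mean(xs), mean(ys)
xs = [x - x_mean for x in xs]
ys = [y - y_mean for y in ys]

R_xx = mean([x * x for x in xs])
R_xy = mean([x * y for x, y in zip(xs, ys)])
R_yy = mean([y * y for y in ys])

a = R_xy / R_xx                                   # Eq. (8.83)
errors = [y - a * x for x, y in zip(xs, ys)]      # epsilon = y - a x

x_eps = mean([x * e for x, e in zip(xs, errors)])  # Eq. (8.84): should vanish
mse_direct = mean([e * e for e in errors])
mse_formula = R_yy - a * R_xy                      # Eq. (8.85)

print("a            =", a)
print("mean(x*eps)  =", x_eps)
print("MSE (direct) =", mse_direct)
print("MSE (8.85)   =", mse_formula)
```

The orthogonality check holds to rounding error for any data set, because a is computed from the same sample averages that appear in the check.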

Using n Random Variables to Estimate a Random Variable 

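If a random variable $x_0$ is related to n RVs $x_1, x_2, \ldots, x_n$, then we can estimate $x_0$ using a
linear combination* of $x_1, x_2, \ldots, x_n$:

$$\hat{x}_0 = a_1x_1 + a_2x_2 + \cdots + a_nx_n = \sum_{i=1}^{n} a_i x_i \qquad (8.86)$$

The mean square error is given by

$$\overline{\epsilon^2} = \overline{[x_0 - (a_1x_1 + a_2x_2 + \cdots + a_nx_n)]^2}$$

To minimize $\overline{\epsilon^2}$, we must set

$$\frac{\partial \overline{\epsilon^2}}{\partial a_1} = \frac{\partial \overline{\epsilon^2}}{\partial a_2} = \cdots = \frac{\partial \overline{\epsilon^2}}{\partial a_n} = 0$$

that is,

$$\frac{\partial \overline{\epsilon^2}}{\partial a_i} = \frac{\partial}{\partial a_i}\,\overline{[x_0 - (a_1x_1 + a_2x_2 + \cdots + a_nx_n)]^2} = 0$$

Interchanging the order of differentiation and averaging, we have

$$\frac{\partial \overline{\epsilon^2}}{\partial a_i} = -2\,\overline{[x_0 - (a_1x_1 + a_2x_2 + \cdots + a_nx_n)]\,x_i} = 0 \qquad (8.87a)$$

Equation (8.87a) can be written as

$$\overline{\epsilon\, x_i} = 0 \qquad i = 1, 2, \ldots, n \qquad (8.87b)$$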

* Throughout this section, as before, we assume that all the random variables have zero mean values. This can be
done without loss of generality.

It can be rewritten into the Yule-Walker equations

$$R_{0i} = a_1R_{i1} + a_2R_{i2} + \cdots + a_nR_{in} \qquad (8.88)$$

where

$$R_{ij} = \overline{x_i x_j}$$

Differentiating $\overline{\epsilon^2}$ with respect to $a_1, a_2, \ldots, a_n$ and equating to zero, we obtain n simultaneous
equations of the form shown in Eq. (8.88). The desired constants $a_1, a_2, \ldots, a_n$ can be found
from these equations by matrix inversion

$$
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
=
\begin{bmatrix}
R_{11} & R_{12} & \cdots & R_{1n} \\
R_{21} & R_{22} & \cdots & R_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
R_{n1} & R_{n2} & \cdots & R_{nn}
\end{bmatrix}^{-1}
\begin{bmatrix} R_{01} \\ R_{02} \\ \vdots \\ R_{0n} \end{bmatrix}
\qquad (8.89)
$$


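Equation (8.87) shows that $\epsilon$ (the error) is orthogonal to data $(x_1, x_2, \ldots, x_n)$ for optimum
estimation. This gives the more general form for the principle of orthogonality in mean square
estimation. Consequently, the mean square error (under optimum conditions) is

$$\overline{\epsilon^2} = \overline{\epsilon\epsilon} = \overline{\epsilon\,[x_0 - (a_1x_1 + a_2x_2 + \cdots + a_nx_n)]}$$

Because $\overline{\epsilon x_i} = 0$ (i = 1, 2, ..., n),

$$
\begin{aligned}
\overline{\epsilon^2} &= \overline{\epsilon x_0} \\
&= \overline{x_0[x_0 - (a_1x_1 + a_2x_2 + \cdots + a_nx_n)]} \\
&= R_{00} - (a_1R_{01} + a_2R_{02} + \cdots + a_nR_{0n}) \qquad (8.90)
\end{aligned}
$$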


Example 8.24 In differential pulse code modulation (DPCM), instead of transmitting sample values directly,
we estimate (predict) the value of each sample from the knowledge of previous n samples.
The estimation error $\epsilon_k$, the difference between the actual value and the estimated value of the
kth sample, is quantized and transmitted (Fig. 8.19). Because the estimation error $\epsilon_k$ is smaller
than the sample value $m_k$, for the same number of quantization levels (the same number of
PCM code bits), the SNR is increased. It was shown in Sec. 6.5 that the SNR improvement
is equal to $\overline{m^2}/\overline{\epsilon^2}$, where $\overline{m^2}$ and $\overline{\epsilon^2}$ are the mean square values of the speech signal and the
estimation error $\epsilon$, respectively. In this example, we shall find the optimum linear second-order
predictor and the corresponding SNR improvement.

The equation of a second-order estimator (predictor), shown in Fig. 8.19, is

$$\hat{m}_k = a_1 m_{k-1} + a_2 m_{k-2}$$

where $\hat{m}_k$ is the best linear estimate of $m_k$. The estimation error $\epsilon_k$ is given by

$$\epsilon_k = \hat{m}_k - m_k = a_1 m_{k-1} + a_2 m_{k-2} - m_k$$

For speech signals, Jayant and Noll [5] give the values of correlations of various
samples as:

$$
\begin{aligned}
\overline{m_k m_k} &= \overline{m^2}, & \overline{m_k m_{k-1}} &= 0.825\,\overline{m^2}, & \overline{m_k m_{k-2}} &= 0.562\,\overline{m^2}, \\
\overline{m_k m_{k-3}} &= 0.308\,\overline{m^2}, & \overline{m_k m_{k-4}} &= 0.004\,\overline{m^2}, & \overline{m_k m_{k-5}} &= -0.243\,\overline{m^2}
\end{aligned}
$$

Note that $R_{ij} = \overline{m_k m_{k-(j-i)}}$. Hence,

$$
\begin{aligned}
R_{11} &= R_{22} = \overline{m^2} \\
R_{12} &= R_{21} = R_{01} = 0.825\,\overline{m^2} \\
R_{02} &= 0.562\,\overline{m^2}
\end{aligned}
$$

The optimum values of $a_1$ and $a_2$ are found from Eq. (8.89) as $a_1 = 1.1314$ and
$a_2 = -0.3714$, and the mean square error in the estimation is given by Eq. (8.90) as

$$\overline{\epsilon^2} = [1 - (0.825a_1 + 0.562a_2)]\,\overline{m^2} = 0.2753\,\overline{m^2} \qquad (8.91)$$

The SNR improvement is $10 \log_{10}\left(\overline{m^2}/0.2753\,\overline{m^2}\right) = 5.6$ dB.
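The numbers in this example can be reproduced with a few lines of Python. The sketch below is one possible way to do it, not the method used in the text: it solves the Yule-Walker equations (8.88) by Gaussian elimination rather than explicit matrix inversion, and the names `solve_linear_system` and `optimum_predictor` are introduced here for illustration. Correlations are normalized by the mean square value of the signal.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * a = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - partial) / aug[r][r]
    return solution


# Normalized correlations quoted in the example:
# rho[j] = mean(m_k m_{k-j}) / mean(m^2), for j = 0..5.
rho = [1.0, 0.825, 0.562, 0.308, 0.004, -0.243]


def optimum_predictor(order):
    """Return (coefficients, normalized MSE, SNR gain in dB) for a given order."""
    R = [[rho[abs(i - j)] for j in range(order)] for i in range(order)]
    r0 = [rho[i + 1] for i in range(order)]            # R_01, R_02, ...
    coeffs = solve_linear_system(R, r0)                # Eq. (8.88)
    mse = 1.0 - sum(a * r for a, r in zip(coeffs, r0))  # Eq. (8.90), normalized
    gain_db = 10.0 * math.log10(1.0 / mse)
    return coeffs, mse, gain_db


coeffs, mse, gain_db = optimum_predictor(2)
print("a1, a2          :", coeffs)
print("normalized MSE  :", mse)
print("SNR gain (dB)   :", gain_db)
```

Calling `optimum_predictor(3)` with the same table of correlations gives a starting point for the third-order case.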


8.6 SUM OF RANDOM VARIABLES 

In many applications, it is useful to characterize the RV z that is the sum of two RVs x and y: 

z = x + y 

Because z = x + y, y = z − x regardless of the value of x. Hence, the event $\mathrm{z} \le z$ is the joint
event [$\mathrm{y} \le z - x$ and x to have any value in the range $(-\infty, \infty)$]. Hence,

$$
\begin{aligned}
F_z(z) = P(\mathrm{z} \le z) &= P(\mathrm{x} \le \infty,\ \mathrm{y} \le z - \mathrm{x}) \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{z-x} p_{xy}(x, y)\,dy\,dx \\
&= \int_{-\infty}^{\infty} dx \int_{-\infty}^{z-x} p_{xy}(x, y)\,dy
\end{aligned}
$$

and

$$p_z(z) = \frac{dF_z(z)}{dz} = \int_{-\infty}^{\infty} p_{xy}(x, z - x)\,dx$$

If x and y are independent RVs, then

$$p_{xy}(x, z - x) = p_x(x)\,p_y(z - x)$$

and

$$p_z(z) = \int_{-\infty}^{\infty} p_x(x)\,p_y(z - x)\,dx \qquad (8.92)$$

The PDF $p_z(z)$ is then the convolution of PDFs $p_x(z)$ and $p_y(z)$. We can extend this result to
a sum of n independent RVs $x_1, x_2, \ldots, x_n$. If

$$z = x_1 + x_2 + \cdots + x_n$$

then the PDF $p_z(z)$ will be the convolution of PDFs $p_{x_1}(x), p_{x_2}(x), \ldots, p_{x_n}(x)$, that is,

$$p_z(x) = p_{x_1}(x) * p_{x_2}(x) * \cdots * p_{x_n}(x) \qquad (8.93)$$
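For discrete RVs the same result holds with the integral replaced by a sum, which makes it easy to try out. The sketch below is a discrete analog of Eqs. (8.92) and (8.93), not the continuous case treated above: it convolves probability mass functions stored as lists, using fair dice as an assumed example. The function name `convolve_pmfs` is introduced here for illustration.

```python
def convolve_pmfs(p, q):
    """Given p[i] = P(x = i) and q[j] = P(y = j) for independent x and y,
    return r with r[k] = P(x + y = k)."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            r[i + j] += p_i * q_j
    return r


# PMF of one fair die; the list index is the face value (index 0 is unused).
die = [0.0] + [1.0 / 6.0] * 6

two_dice = convolve_pmfs(die, die)          # sum of two independent dice
three_dice = convolve_pmfs(two_dice, die)   # repeated convolution, as in Eq. (8.93)

print("P(sum of two dice = 7)   :", two_dice[7])
print("P(sum of three dice = 4) :", three_dice[4])
print("total probability        :", sum(three_dice))
```

Each additional independent term in the sum costs one more convolution, which mirrors the chain of convolutions in Eq. (8.93).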


Sum of Gaussian Random Variables 

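Gaussian random variables have several very important properties. For example, a Gaussian
random variable x and its probability density function $p_x(x)$ are fully described by the mean
$\mu_x$ and the variance $\sigma_x^2$. Furthermore, the sum of any number of jointly distributed Gaussian
random variables is also a Gaussian random variable, regardless of their relationships (such as
dependency). Again, note that when the members of a set of random variables $\{x_i\}$ are jointly
Gaussian, each individual random variable $x_i$ also has Gaussian distribution.

As an example, we will show that the sum of two independent, zero mean, Gaussian
random variables is Gaussian. Let $x_1$ and $x_2$ be two zero mean and independent Gaussian
random variables with probability density functions

$$p_{x_1}(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\,e^{-x^2/(2\sigma_1^2)} \quad \text{and} \quad p_{x_2}(x) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\,e^{-x^2/(2\sigma_2^2)}$$

Let

$$y = x_1 + x_2$$

The probability density function of y is therefore

$$p_y(y) = \int_{-\infty}^{\infty} p_{x_1}(x)\,p_{x_2}(y - x)\,dx$$

Upon carrying out this convolution (integration), we have

$$
\begin{aligned}
p_y(y) &= \int_{-\infty}^{\infty} \frac{1}{2\pi\sigma_1\sigma_2} \exp\left(-\frac{x^2}{2\sigma_1^2} - \frac{(y - x)^2}{2\sigma_2^2}\right) dx \\
&= \frac{1}{\sqrt{2\pi(\sigma_1^2 + \sigma_2^2)}}\, e^{-\frac{y^2}{2(\sigma_1^2 + \sigma_2^2)}}
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\frac{\sigma_1\sigma_2}{\sqrt{\sigma_1^2 + \sigma_2^2}}}
\exp\left[-\frac{\left(x - \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,y\right)^2}{2\,\frac{\sigma_1^2\sigma_2^2}{\sigma_1^2 + \sigma_2^2}}\right] dx \qquad (8.94)
\end{aligned}
$$

By a simple change of variable

$$w = \frac{x - \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,y}{\frac{\sigma_1\sigma_2}{\sqrt{\sigma_1^2 + \sigma_2^2}}}$$

we can rewrite the integral of Eq. (8.94) as

$$
p_y(y) = \frac{1}{\sqrt{2\pi(\sigma_1^2 + \sigma_2^2)}}\, e^{-\frac{y^2}{2(\sigma_1^2 + \sigma_2^2)}}\,
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-w^2/2}\,dw
= \frac{1}{\sqrt{2\pi(\sigma_1^2 + \sigma_2^2)}}\, e^{-\frac{y^2}{2(\sigma_1^2 + \sigma_2^2)}} \qquad (8.95)
$$

By examining Eq. (8.95), it can be seen that y is a Gaussian RV with zero mean and variance:

$$\sigma_y^2 = \sigma_1^2 + \sigma_2^2$$

In fact, because $x_1$ and $x_2$ are independent, they must be uncorrelated. This relationship can
be obtained from Eq. (8.81).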
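The closed form of Eq. (8.95) can be compared with a direct numerical evaluation of the convolution integral. The sketch below is only a check under assumed values ($\sigma_1 = 1$, $\sigma_2 = 2$, and a simple rectangle-rule integration over a finite range); the function names are introduced here for illustration.

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian PDF with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def convolved_pdf(y, sigma1, sigma2, half_width=20.0, step=0.01):
    """Rectangle-rule approximation of the convolution integral for p_y(y).
    The finite range and step size are assumptions of this sketch."""
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * step
        total += gaussian_pdf(x, sigma1) * gaussian_pdf(y - x, sigma2)
    return total * step


sigma1, sigma2 = 1.0, 2.0                       # assumed example values
sigma_y = math.sqrt(sigma1 ** 2 + sigma2 ** 2)  # variance adds, per Eq. (8.95)

test_points = [0.0, 1.0, 2.5]
numeric = [convolved_pdf(y, sigma1, sigma2) for y in test_points]
closed_form = [gaussian_pdf(y, sigma_y) for y in test_points]

for y, num, ref in zip(test_points, numeric, closed_form):
    print("y =", y, " numeric:", num, " Eq. (8.95):", ref)
```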

More generally [5], if $x_1$ and $x_2$ are jointly Gaussian but not necessarily independent, then
$y = x_1 + x_2$ is a Gaussian RV with mean

$$\bar{y} = \bar{x}_1 + \bar{x}_2$$

and variance

$$\sigma_y^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2 + 2\sigma_{x_1x_2}$$

Based on induction, the sum of any number of jointly Gaussian distributed RVs is still
Gaussian. More importantly, for any fixed constants $\{a_i,\ i = 1, \ldots, m\}$ and jointly Gaussian
RVs $\{x_i,\ i = 1, \ldots, m\}$,

$$\sum_{i=1}^{m} a_i x_i$$

remains Gaussian. This result has important practical implications. For example, if $x_k$ is a
sequence of jointly Gaussian signal samples passing through a discrete time filter with impulse
response $\{h_i\}$, then the filter output

$$y = \sum_{i=0}^{\infty} h_i x_{k-i} \qquad (8.96)$$

will continue to be Gaussian. The fact that linear filter output to a Gaussian signal input will be
a Gaussian signal is highly significant and is one of the most useful results in communication
analysis.


8.7 CENTRAL LIMIT THEOREM 

Under certain conditions, the sum of a large number of independent RVs tends to be a Gaussian
random variable, independent of the probability densities of the variables added.* The rigorous
statement of this tendency is what is known as the central limit theorem.† Proof of this theorem
can be found in Refs. 6 and 7. We shall give here only a simple plausibility argument.

The tendency toward a Gaussian distribution when a large number of functions are
convolved is shown in Fig. 8.20. For simplicity, we assume all PDFs to be identical, that is, a gate
function $0.5\,\Pi(x/2)$. Figure 8.20 shows the successive convolutions of gate functions. The
tendency toward a bell-shaped density is evident.

This important result, that the distribution of the sum of n independent Bernoulli random
variables, when properly normalized, converges toward Gaussian distribution, was established
first by A. de Moivre in the early 1700s. The more general proof for an arbitrary distribution
is credited to J. W. Lindeberg and P. Lévy in the 1920s. Note that the "normalized sum" is
the sample average (or sample mean) of n random variables.


Central Limit Theorem (for the sample mean): 

Let $x_1, \ldots, x_n$ be independent random samples from a given distribution with mean $\mu$ and
variance $\sigma^2$ with $0 < \sigma^2 < \infty$. Then for any value x, we have

$$\lim_{n\to\infty} P\left[\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{x_i - \mu}{\sigma} \le x\right] = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\,e^{-v^2/2}\,dv \qquad (8.97)$$

or equivalently,

$$\lim_{n\to\infty} P\left[\frac{\tilde{x}_n - \mu}{\sigma/\sqrt{n}} > x\right] = Q(x) \qquad (8.98)$$

Note that

$$\tilde{x}_n = \frac{x_1 + \cdots + x_n}{n}$$

is known as the sample mean. The interpretation is that the sample mean of any distribution
with nonzero finite variance converges to Gaussian distribution with fixed mean $\mu$ and
decreasing variance $\sigma^2/n$. In other words, regardless of the true distribution of $x_i$, $\sum_{i=1}^{n} x_i$
can be approximated by a Gaussian distribution with mean $n\mu$ and variance $n\sigma^2$.

Figure 8.20 Demonstration of the central limit theorem.

* If the variables are Gaussian, this is true even if the variables are not independent.
† Actually a group of theorems collectively called the central limit theorem.


Example 8.25 Consider a communication system that transmits a data packet of 1024 bits. Each bit can be
in error with probability of $10^{-2}$. Find the (approximate) probability that more than 30 of the
1024 bits are in error.

Define a random variable $x_i$ such that $x_i = 1$ if the ith bit is in error and $x_i = 0$ if not.
Hence

$$v = \sum_{i=1}^{1024} x_i$$

is the number of errors in the data packet. We would like to find P(v > 30).

Since $P(x_i = 1) = 10^{-2}$ and $P(x_i = 0) = 1 - 10^{-2}$, strictly speaking we would
need to find

$$P(v > 30) = \sum_{m=31}^{1024} \binom{1024}{m} \left(10^{-2}\right)^m \left(1 - 10^{-2}\right)^{1024-m}$$

This calculation is time-consuming. We now apply the central limit theorem to solve this
problem approximately.

First, we find

$$\overline{x_i} = 10^{-2} \times (1) + (1 - 10^{-2}) \times (0) = 10^{-2}$$

$$\overline{x_i^2} = 10^{-2} \times (1)^2 + (1 - 10^{-2}) \times (0) = 10^{-2}$$

As a result,

$$\sigma_i^2 = \overline{x_i^2} - (\overline{x_i})^2 = 0.0099$$

Based on the central limit theorem, $v = \sum_i x_i$ is approximately Gaussian with mean of
$1024 \cdot 10^{-2} = 10.24$ and variance $1024 \times 0.0099 = 10.1376$. Since

$$y = \frac{v - 10.24}{\sqrt{10.1376}}$$

is a standard Gaussian with zero mean and unit variance,

$$
\begin{aligned}
P(v > 30) &= P\left(y > \frac{30 - 10.24}{\sqrt{10.1376}}\right) \\
&= P(y > 6.20611) \\
&= Q(6.20611) \\
&\simeq 1.925 \times 10^{-10}
\end{aligned}
$$
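The binomial sum described as time-consuming is quick to evaluate by machine, which makes it possible to compare the Gaussian approximation with the exact tail. The sketch below is an illustration under stated assumptions: Q(x) is evaluated through the complementary error function, and the binomial terms are summed in the log domain to avoid overflow; the function names are introduced here and do not come from the text.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), written via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def binomial_tail(n, p, threshold):
    """P(v > threshold) for v ~ Binomial(n, p), summed term by term in the log domain."""
    total = 0.0
    for k in range(threshold + 1, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log(1.0 - p))
        total += math.exp(log_term)
    return total


n_bits, p_error, threshold = 1024, 1e-2, 30

mean_v = n_bits * p_error                      # 1024 x 0.01
var_v = n_bits * p_error * (1.0 - p_error)     # 1024 x 0.0099
z = (threshold - mean_v) / math.sqrt(var_v)    # normalized threshold

gaussian_estimate = q_function(z)
exact_tail = binomial_tail(n_bits, p_error, threshold)

print("normalized threshold :", z)
print("Gaussian estimate    :", gaussian_estimate)
print("exact binomial tail  :", exact_tail)
```

Running the sketch suggests that, this far out in the tail of a skewed binomial distribution, the Gaussian figure should be read as an order-of-magnitude guide rather than an accurate value; the approximation is generally more trustworthy near the mean.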
Now is a good time to further relax the conditions in the central limit theorem for the sample
mean. This highly important generalization was proved by the famous Russian mathematician
A. Lyapunov in 1901.


Central Limit Theorem (for the sum of independent random variables): 

Let random variables $x_1, \ldots, x_n$ be independent but not necessarily identically distributed.
Each of the random variables $x_i$ has mean $\mu_i$ and nonzero variance $\sigma_i^2 < \infty$. Furthermore,
suppose that each third-order central moment

$$\overline{|x_i - \mu_i|^3} < \infty \qquad i = 1, \ldots, n$$

and suppose

$$\lim_{n\to\infty} \frac{\sum_{i=1}^{n} \overline{|x_i - \mu_i|^3}}{\left(\sum_{i=1}^{n} \sigma_i^2\right)^{3/2}} = 0$$

Then random variable

$$y(n) = \frac{\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$$

converges to a standard Gaussian density as $n \to \infty$, that is,

$$\lim_{n\to\infty} P[y(n) > x] = Q(x) \qquad (8.99)$$

The central limit theorem provides a plausible explanation for the well-known fact that
many random variables in practical experiments are approximately Gaussian. For example,
communication channel noise is the sum effect of many different random disturbance sources
(e.g., sparks, lightning, static electricity). Based on the central limit theorem, noise as the sum
of all these random disturbances should be approximately Gaussian.
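The tendency described by the theorem can be observed in a small simulation. The sketch below is a plausibility check, not a proof: it adds 12 independent uniform RVs (an assumed choice, with mean 1/2 and variance 1/12 each), normalizes the sum as in the theorem, and compares the observed fraction of trials exceeding a level with Q of that level. The seed, trial count, and function names are assumptions of this sketch.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x), written via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def normalized_sum(n, rng):
    """Sum of n independent uniform(0, 1) RVs, shifted and scaled to zero mean
    and unit variance (each term has mean 1/2 and variance 1/12)."""
    total = sum(rng.random() for _ in range(n))
    return (total - n * 0.5) / math.sqrt(n / 12.0)


rng = random.Random(1)    # fixed seed so the run is repeatable
trials = 20000
n_terms = 12
level = 1.0

exceed = sum(1 for _ in range(trials) if normalized_sum(n_terms, rng) > level)
empirical = exceed / trials
gaussian = q_function(level)

print("empirical P[y(n) > 1] :", empirical)
print("Q(1)                  :", gaussian)
```

Changing `n_terms` to a small value such as 1 or 2 shows how the agreement degrades when few variables are added.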
8.1-1 A card is drawn randomly from a regular deck of cards. Assign probability to the event that the
card drawn is: (a) a red card; (b) a black queen; (c) a picture card (count an ace as a picture
card); (d) a number card with number 7; (e) a number card with number ≤ 5.

8.1-2 Three regular dice are thrown. Assign probabilities to the following events: the sum of the points
appearing on the three dice is (a) 4; (b) 9; (c) 15.

8.1-3 The probability that the number i appears on a throw of a certain loaded die is ki (i =
1, 2, ..., 6). Assign probabilities to all six outcomes.

8.1-4 A bin contains three oscillator microchips, marked $O_1$, $O_2$, and $O_3$, and two PLL microchips,
marked $P_1$ and $P_2$. Two chips are picked randomly in succession without replacement.

(a) How many outcomes are possible (i.e., how many points are in the sample space)? List all
the outcomes and assign probabilities to each of them.

(b) Express the following events as unions of the outcomes in part (a): (i) one chip drawn is
marked oscillator and the other PLL; (ii) both chips are PLL; (iii) both chips are
oscillators; and (iv) both chips are of the same kind. Assign probabilities to each of these
events.

8.1-5 Use Eq. (8.12) to find the probabilities in Prob. 8.1-4, part (b).

8.1-6 In Prob. 8.1-4, determine the probability that:

(a) The second pick is an oscillator chip given that the first pick is a PLL chip.

(b) The second pick is an oscillator chip given that the first pick is also an oscillator chip.

8.1-7 A binary source generates digits 1 and 0 randomly with equal probability. Assign probabilities
to the following events with respect to 10 digits generated by the source: (a) there are exactly
two 1s and eight 0s; (b) there are at least four 0s.

8.1-8 In the California lottery (Lotto), a player chooses any 6 numbers out of 49 numbers (1
through 49). Six balls are drawn randomly (without replacement) from the 49 balls numbered
1 through 49.

(a) Find the probability of matching all 6 balls to the 6 numbers chosen by the player.

(b) Find the probability of matching exactly 5 balls.

(c) Find the probability of matching exactly 4 balls.

(d) Find the probability of matching exactly 3 balls.

8.1-9 A network consists of 10 links $s_1, s_2, \ldots, s_{10}$ in cascade (Fig. P8.1-9). If any one of the links
fails, the entire system fails. All links are independent, with equal probability of failure p =
0.01.

(a) What is the probability of failure of the network?

Hint: Consider the probability that none of the links fails.

(b) The reliability of a network is the probability of not failing. If the system reliability is
required to be 0.99, what must be the failure probability of each link?

Figure P.8.1-9
8.1-10 Network reliability improves when redundant links are used. The reliability of the network in
Prob. 8.1-9 (Fig. P8.1-9) can be improved by building two subnetworks in parallel (Fig. P8.1-10).
Thus, if one subnetwork fails, the other one will still connect.

(a) Using the data in Prob. 8.1-9, determine the reliability of the network in Fig. P8.1-10.

(b) If the reliability of this new network is required to be 0.999, what must be the failure
probability of each link?

Figure P.8.1-10

8.1-11 Compare the reliability of the two networks in Fig. P8.1-11, given that the failure probability of
links $s_1$ and $s_2$ is p each.

Figure P.8.1-11 (a), (b)


8.1-12 In a poker game each player is dealt five cards from a regular deck of 52 cards. What is the 
probability that a player will get a flush (all five cards of the same suit)? 

8.1-13 Two dice are thrown. One die is regular and the other is biased with the following probabilities:

$$P(1) = P(6) = \frac{1}{6}, \qquad P(2) = P(4) = 0, \qquad P(3) = P(5) = \frac{1}{3}$$

Determine the probabilities of obtaining a sum: (a) 4; (b) 5.

8.1-14 In Sec. 8.1, Example 8.5, determine:

(a) P(B), the probability of drawing an ace in the second draw.

(b) P(A|B), the probability that the first draw was a red ace given that the second draw is an
ace.

Hint: Event B can occur in two ways: the first draw is a red ace and the second draw is an
ace, or the first draw is not a red ace and the second draw is an ace. This is $(A \cap B) \cup (A^c \cap B)$ (see
Fig. 8.2).

8.1-15 A binary source generates digits 1 and 0 randomly with probabilities P(1) = 0.8 and
P(0) = 0.2.

(a) What is the probability that exactly two 1s will occur in an n-digit sequence?

(b) What is the probability that at least three 1s will occur in an n-digit sequence?
8.1-16 In a binary communication channel, the receiver detects binary pulses with an error probability
$P_e$. What is the probability that out of 100 received digits, no more than four digits are in
error?


8.1-17 A PCM channel consists of 10 links, with a regenerative repeater at the end of each link. If
the detection error probabilities of the 15 detectors are $p_1, p_2, \ldots, p_{15}$, determine the detection
error probability of the entire channel if $p_i \ll 1$.

8.1-18 Example 8.8 considers the possibility of improving reliability by repeating a digit three times.
Repeat this analysis for five repetitions.

8.1-19 A box contains nine bad microchips. A good microchip is thrown into the box by mistake.
Someone is trying to retrieve the good chip. He draws a chip randomly and tests it. If the chip
is bad, he throws it out and draws another chip randomly, repeating the procedure until he finds
the good chip.

(a) What is the probability that he will find the good chip in the first trial?

(b) What is the probability that he will find the good chip in five trials?

8.1-20 One out of a group of 10 people is to be selected for a suicide mission by drawing straws. There
are 10 straws: nine are of the same length and the tenth is shorter than the others. Each of the
10 people draws a straw, one by one. The person who draws the short straw is selected for the
mission. Determine which position in the sequence favors the most and which favors the least
drawing the short straw.

8.2-1 For a certain binary nonsymmetric channel it is given that

$$P_{y|x}(0|1) = 0.1 \quad \text{and} \quad P_{y|x}(1|0) = 0.2$$

where x is the transmitted digit and y is the received digit. If $P_x(0) = 0.4$, determine $P_y(0)$ and
$P_y(1)$.

8.2-2 A binary symmetric channel (see Example 8.13) has an error probability $P_e$. The probability of
transmitting 1 is Q. If the receiver detects an incoming digit as 1, what is the probability that the
originally transmitted digit was: (a) 1; (b) 0?

Hint: If x is the transmitted digit and y is the received digit, you are given $P_{y|x}(0|1) =
P_{y|x}(1|0) = P_e$. Now using Bayes' rule, find $P_{x|y}(1|1)$ and $P_{x|y}(0|1)$.

8.2-3 The PDF of amplitude x of a certain signal x(t) is given by $p_x(x) = 0.5|x|e^{-|x|}$.

(a) Find the probability that x > 1.

(b) Find the probability that -1 < x < 2.

(c) Find the probability that x < -2.

8.2-4 The PDF of an amplitude x of a Gaussian signal x(t) is given by

$$p_x(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$$

This signal is applied to the input of a half-wave rectifier circuit (Fig. P8.2-4).

Assuming an ideal diode, determine $F_y(y)$ and $p_y(y)$ of the output signal amplitude y =
x · u(x). Notice that the probability of y = 0 is not zero.

Figure P.8.2-4

Figure P.8.2-7



8.2-5 The PDF of a Gaussian variable x is given by

$$p_x(x) = \frac{1}{3\sqrt{2\pi}}\,e^{-(x-4)^2/18}$$

Determine: (a) P(x > 4); (b) P(x < 0); (c) P(x > -2).

8.2-6 For an RV x with PDF

$$p_x(x) = \text{[expression garbled in source]}$$

(a) Sketch $p_x(x)$, and state (with reasons) if this is a Gaussian RV.

(b) Determine: (i) P(x > 1), (ii) P(1 < x < 2).

(c) How to generate RV x from another Gaussian RV? Show block diagram and explain.

8.2-7 The joint PDF of RVs x and y is shown in Fig. P8.2-7.

(a) Determine: (i) A; (ii) $p_x(x)$; (iii) $p_y(y)$; (iv) $p_{x|y}(x|y)$; (v) $p_{y|x}(y|x)$.

(b) Are x and y independent? Explain.

8.2-8 The joint PDF $p_{xy}(x, y)$ of two continuous RVs is given by

$$p_{xy}(x, y) = xy\,e^{-(x^2 + y^2)/2}\,u(x)\,u(y)$$

(a) Find $p_x(x)$, $p_y(y)$, $p_{x|y}(x|y)$, and $p_{y|x}(y|x)$.

(b) Are x and y independent?

8.2-9 RVs x and y are said to be jointly Gaussian if their joint PDF is given by

$$p_{xy}(x, y) = \frac{1}{2\pi\sqrt{M}}\,e^{-(ax^2 + by^2 - 2cxy)/2M}$$

where $M = ab - c^2$. Show that $p_x(x)$, $p_y(y)$, $p_{x|y}(x|y)$, and $p_{y|x}(y|x)$ are all Gaussian and that
$\overline{x^2} = b$, $\overline{y^2} = a$, and $\overline{xy} = c$.

Hint: Use

$$\int_{-\infty}^{\infty} e^{-px^2 + qx}\,dx = \sqrt{\frac{\pi}{p}}\,e^{q^2/4p}$$

8.2-10 The joint PDF of RVs x and y is given by

$$p_{xy}(x, y) = k\,e^{-(x^2 + xy + y^2)}$$

Determine: (a) the constant k; (b) $p_x(x)$; (c) $p_y(y)$; (d) $p_{x|y}(x|y)$; (e) $p_{y|x}(y|x)$. Are x and y
independent?

8.2-11 In the example on threshold detection (Example 8.16), it was assumed that the digits 1 and 0
were transmitted with equal probability. If $P_x(1)$ and $P_x(0)$, the probabilities of transmitting 1
and 0, respectively, are not equal, show that the optimum threshold is not 0 but is a, where

$$a = \frac{\sigma_n^2}{2A_p}\,\ln\frac{P_x(0)}{P_x(1)}$$

Hint: Assume that the optimum threshold is a, and write $P_e$ in terms of the Q functions. For the
optimum case, $dP_e/da = 0$. Use the fact that

$$Q(x) = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy$$

and

$$\frac{dQ(x)}{dx} = -\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

8.3-1 If an amplitude x of a Gaussian signal x(t) has a mean value of 2 and an RMS value of 3,
determine its PDF.

8.3-2 Determine the mean, the mean square, and the variance of the RV x in Prob. 8.2-3.

8.3-3 Determine the mean and the mean square value of RV x in Prob. 8.2-4.

8.3-4 Determine the mean and the mean square value of RV x in Prob. 8.2-6.

8.3-5 Find the mean, the mean square, and the variance of the RV x in Fig. P8.3-5.

Figure P.8.3-5
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Figure P.8.6-1 


8.3-6 The sum of points on two tossed dice is a discrete RV x, as analyzed in Example 8.12. Determine
the mean, the mean square, and the variance of the RV x.

8.3-7 For a Gaussian PDF $p_x(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\,e^{-x^2/2\sigma_x^2}$, show that

$$
\overline{x^n} =
\begin{cases}
(1)(3)(5)\cdots(n - 1)\,\sigma_x^n & n \text{ even} \\
0 & n \text{ odd}
\end{cases}
$$

Hint: See appropriate definite integrals in any standard mathematical table.

8.3-8 Ten regular dice are thrown. The sum of the numbers appearing on these 10 dice is an RV x.
Find $\bar{x}$, $\overline{x^2}$, and $\sigma_x^2$.

Hint: Remember that the outcome of each die is independent.

8.5-1 Show that $|\rho_{xy}| \le 1$, where $\rho_{xy}$ is the correlation coefficient [Eq. (8.79)] of RVs x and y.

Hint: For any real number a,

$$\overline{[a(x - \bar{x}) - (y - \bar{y})]^2} \ge 0$$

The discriminant of this quadratic in a is nonpositive.

8.5-2 Show that if two RVs x and y are related by

$$y = k_1 x + k_2$$

where $k_1$ and $k_2$ are arbitrary constants, the correlation coefficient $\rho_{xy} = 1$ if $k_1$ is positive, and
$\rho_{xy} = -1$ if $k_1$ is negative.

8.5-3 Given x = cos Θ and y = sin Θ, where Θ is an RV uniformly distributed in the range (0, 2π),
show that x and y are uncorrelated but are not independent.

8.6-1 The random binary signal x(t), shown in Fig. P8.6-1a, can take on only two values, 3 and 0,
with equal probability. An exponential channel noise n(t) shown in Fig. P8.6-1b is added to this
signal, giving the received signal y(t). The PDF of the noise amplitude n is exponential with a
zero mean and a variance of 2. Determine and sketch the PDF of the amplitude y.

Hint: Use of Eq. (8.92) yields $p_y(y) = p_x(x) * p_n(n)$.

[Fig. P8.6-1 panels: (a) x(t), with levels 3 and 0; (b) n(t)]

8.6-2 Repeat Prob. 8.6-1 if the amplitudes 3 and 0 of x(t) are not equiprobable but $P_x(3) = 0.6$ and
$P_x(0) = 0.4$.
8.6-3 If x(t) and y(t) are both independent binary signals, each taking on values -1 and 1 only, with

$$P_x(1) = Q = 1 - P_x(-1)$$

$$P_y(1) = P = 1 - P_y(-1)$$

determine $P_z(z_i)$ where z = x + y.

8.6-4 If z = x + y, where x and y are independent Gaussian RVs with

$$p_x(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\,e^{-(x - \bar{x})^2/2\sigma_x^2} \quad \text{and} \quad p_y(y) = \frac{1}{\sigma_y\sqrt{2\pi}}\,e^{-(y - \bar{y})^2/2\sigma_y^2}$$

then show that z is also Gaussian with

$$\bar{z} = \bar{x} + \bar{y} \quad \text{and} \quad \sigma_z^2 = \sigma_x^2 + \sigma_y^2$$

Hint: Convolve $p_x(x)$ and $p_y(y)$. See pair 22 in Table 3.1.

8.6-5 In Example 8.24, design the optimum third-order predictor processor for speech signals and
determine the SNR improvement. Values of various correlation coefficients for speech signals
are given in Example 8.24.



RANDOM PROCESSES AND 
SPECTRAL ANALYSIS 


The notion of a random process is a natural extension of the random variable (RV).
Consider, for example, the temperature x of a certain city at noon. The temperature x
is an RV and takes on different values every day. To get the complete statistics of x, we
need to record values of x at noon over many days (a large number of trials). From this data,
we can determine $p_x(x)$, the PDF of the RV x (the temperature at noon).

But the temperature is also a function of time. At 1 p.m., for example, the temperature
may have an entirely different distribution from that of the temperature at noon. Still, the
two temperatures may be related, via a joint probability density function. Thus, this random
temperature x is a function of time and can be expressed as x(t). If the random variable is
defined for a time interval $t \in [t_a, t_b]$, then x(t) is a function of time and is random for every
instant $t \in [t_a, t_b]$. An RV that is a function of time* is called a random process, or stochastic
process. Thus, a random process is a collection of an infinite number of RVs. Communication
signals as well as noises, typically random and varying with time, are well characterized by
random processes. For this reason, random processes are the subject of this chapter before we
study the performance analysis of different communication systems.


9.1 FROM RANDOM VARIABLE TO 
RANDOM PROCESS 

To specify an RV x, we run multiple trials of the experiment and from the outcomes estimate
$p_x(x)$. Similarly, to specify the random process x(t), we do the same thing for each time instant
t. To continue with our example of the random process x(t), the temperature of the city, we
need to record daily temperatures for each value of t (for each time of the day). This can be
done by recording temperatures at every instant of the day, which gives a waveform $x(t, \zeta_i)$,
where $\zeta_i$ indicates the day for which the record was taken. We need to repeat this procedure
every day for a large number of days. The collection of all possible waveforms is known as the
ensemble (corresponding to the sample space) of the random process x(t). A waveform in this
collection is a sample function (rather than a sample point) of the random process (Fig. 9.1).


* Actually, to qualify as a random process, x could be a function of any practical variable, such as distance. In fact, a
random process may also be a function of more than one variable.

Figure 9.1 Random process to represent the temperature of a city.

Figure 9.2 Ensemble with a finite number of sample functions.

Sample function amplitudes at some instant $t = t_1$ are the values taken by the RV $x(t_1)$ in
various trials.

We can view a random process in another way. In the case of an RV, the outcome of
each trial of the experiment is a number. We can view a random process also as the outcome
of an experiment, where the outcome of each trial is a waveform (a sample function) that is
a function of t. The number of waveforms in an ensemble may be finite or infinite. In the
case of the random process x(t) (the temperature of a city), the ensemble has infinitely many
waveforms. On the other hand, if we consider the output of a binary signal generator (over the
period 0 to 10T), there are at most $2^{10}$ waveforms in this ensemble (Fig. 9.2).

One fine point that needs clarification is that the waveforms (or sample functions) in the
ensemble are not random. They have occurred and are therefore deterministic. Randomness
in this situation is associated not with the waveform but with the uncertainty as to which
waveform would occur in a given trial. This is completely analogous to the situation of an RV.
For example, in the experiment of tossing a coin four times in succession (Example 8.4), 16
possible outcomes exist, all of which are known. The randomness in this situation is associated
not with the outcomes but with the uncertainty as to which of the 16 outcomes will occur
in a given trial. Indeed, the random process is basically an infinitely long vector of random
variables. Once an experiment is completed, the sampled vector is deterministic. However,
since each element in the vector is random, the experimental outcome is also random, leading
to uncertainty over what vector (or function) will be generated in each experiment.

Characterization of a Random Process 

The next important question is how to characterize (describe) a random process. In some cases,
we may be able to describe it analytically. Consider, for instance, a random process described
by $x(t) = A\cos(\omega_c t + \Theta)$, where $\Theta$ is an RV uniformly distributed over the range $(0, 2\pi)$.
This analytical expression completely describes a random process (and its ensemble). Each
sample function is a sinusoid of amplitude A and frequency $\omega_c$. But the phase is random (see
later, Fig. 9.5). It is equally likely to take any value in the range $(0, 2\pi)$. Such an analytical
description requires well-defined models such that the random process is characterized by
specific parameters that are random variables.
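
As a rough illustration of such an analytical description, the sketch below draws a few sample functions of the random-phase sinusoid by picking the phase at random and then reads off the amplitudes at one instant. The function names, the parameter values, and the seed are our own illustrative choices, not part of the text.

```python
import math
import random


def sample_function(amplitude, omega_c, theta):
    """Return one sample function x(t) = A cos(omega_c * t + theta)."""
    return lambda t: amplitude * math.cos(omega_c * t + theta)


def draw_ensemble(num_functions, amplitude=1.0, omega_c=2.0 * math.pi, seed=0):
    """Draw sample functions by picking Theta uniformly over (0, 2*pi)."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_functions)]
    return [sample_function(amplitude, omega_c, th) for th in thetas]


ensemble = draw_ensemble(num_functions=5)
t1 = 0.25
# Values taken by the RV x(t1): one amplitude per sample function (per trial).
amplitudes_at_t1 = [x(t1) for x in ensemble]
print(amplitudes_at_t1)
```

Once the phase has been drawn, each sample function is an ordinary deterministic sinusoid; the randomness lies only in which phase (and hence which waveform) a given trial produces.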

Unfortunately, it is not always possible to describe a random process analytically.
Without a specific model, we may have just an ensemble obtained experimentally. The ensemble
has the complete information about the random process. From this ensemble, we must find
some quantitative measure that will specify or characterize the random process. In this case,
we consider the random process as an RV x that is a function of time. Thus, a random process is
just a collection of an infinite number of RVs, which are generally dependent. We know that the
complete information of several dependent RVs is provided by the joint PDF of those variables.
Let $x_i$ represent the RV $x(t_i)$ generated by the amplitudes of the random process at instant $t = t_i$.
Thus, $x_1$ is the RV generated by the amplitudes at $t = t_1$, and $x_2$ is the RV generated by the
amplitudes at $t = t_2$, and so on, as shown in Fig. 9.1. The n RVs $x_1, x_2, x_3, \ldots, x_n$ generated
by the amplitudes at $t = t_1, t_2, t_3, \ldots, t_n$, respectively, are dependent in general. For the n
samples, they are fully characterized by the nth-order joint probability density function or the
nth-order joint cumulative distribution function (CDF)


$$F_x(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) = P[x(t_1) \le x_1;\ x(t_2) \le x_2;\ \ldots;\ x(t_n) \le x_n]$$

The definition of the joint CDF of the n random samples leads to the joint PDF

$$p_x(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) = \frac{\partial^n}{\partial x_1\, \partial x_2 \cdots \partial x_n}\, F_x(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) \tag{9.1}$$


This discussion provides some good insight. It can be shown that the random process is completely
described by the nth-order joint PDF (9.1) for all n (up to $\infty$) and for any choice of
$t_1, t_2, t_3, \ldots, t_n$. Determining this PDF (of infinite order) is a formidable task. Fortunately,
we shall soon see that when analyzing random signals and noises in conjunction with linear
systems, we are often content with the specifications of the first- and second-order statistics.

A higher order PDF is the joint PDF of the random process at multiple time instants. Hence,
we can always derive a lower order PDF from a higher order PDF by simple integration. For
instance,

$$p_x(x_1; t_1) = \int_{-\infty}^{\infty} p_x(x_1, x_2; t_1, t_2)\, dx_2$$

Hence, when the nth-order PDF is available, there is no need to specify PDFs of order lower
than n.

Figure 9.3 A random process to represent a channel noise (the sampling instant is marked).

The mean $\overline{x(t)}$ of a random process x(t) can be determined from the first-order PDF as

$$\overline{x(t)} = \int_{-\infty}^{\infty} x\, p_x(x; t)\, dx \tag{9.2}$$

which is typically a deterministic function of time t.
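
The two integrations just described can be checked numerically. The sketch below assumes, purely for illustration, that the amplitudes at two fixed instants are jointly Gaussian with unit variances; the means, the correlation coefficient, and the integration limits are hypothetical choices of ours. It integrates $x_2$ out of the second-order PDF and then computes the mean from the resulting first-order PDF, in the manner of Eq. (9.2).

```python
import math


def joint_pdf(x1, x2, mean1=1.0, mean2=-0.5, rho=0.6):
    """Second-order PDF (assumed): jointly Gaussian, unit variances, correlation rho."""
    u, v = x1 - mean1, x2 - mean2
    norm = 2.0 * math.pi * math.sqrt(1.0 - rho * rho)
    exponent = -(u * u - 2.0 * rho * u * v + v * v) / (2.0 * (1.0 - rho * rho))
    return math.exp(exponent) / norm


def midpoint_integral(func, lower, upper, steps):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


def first_order_pdf(x1):
    """Integrate x2 out of the second-order PDF to obtain the first-order PDF."""
    return midpoint_integral(lambda x2: joint_pdf(x1, x2), -12.0, 12.0, 600)


def ensemble_mean():
    """Mean computed from the first-order PDF, in the manner of Eq. (9.2)."""
    return midpoint_integral(lambda x: x * first_order_pdf(x), -8.0, 10.0, 450)


marginal = first_order_pdf(0.3)
# Closed-form first-order Gaussian PDF (mean 1, unit variance) for comparison.
expected = math.exp(-0.5 * (0.3 - 1.0) ** 2) / math.sqrt(2.0 * math.pi)
mean_value = ensemble_mean()
print(marginal, expected, mean_value)
```

Under these assumptions the numerically integrated first-order PDF agrees with the closed-form Gaussian, and the computed mean agrees with the mean assumed at the first instant.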

Why Do We Need Ensemble Statistics? 

The preceding discussion shows that to specify a random process, we need ensemble statistics.
For instance, to determine the PDF $p_{x_1}(x_1)$, we need to find the values of all the sample functions
at $t = t_1$. This is ensemble statistics. In the same way, the inclusion of all possible statistics in the
specification of a random process necessitates some kind of ensemble statistics. In deterministic
signals, we are used to studying the data of a waveform (or waveforms) as a function of time.
Hence, the idea of investigating ensemble statistics makes us feel a bit uncomfortable at first.
Theoretically, we may accept it, but does it have any practical significance? How is this concept
useful in practice? We shall now answer this question.

To understand the necessity of ensemble statistics, consider the problem of threshold
detection in Example 8.16. A 1 is transmitted by $p(t)$ and a 0 is transmitted by $-p(t)$ (polar
signaling). The peak pulse amplitude is $A_p$. When 1 is transmitted, the received sample value
is $A_p + n$, where n is the noise. We would make a decision error if the noise value at the
sampling instant $t_s$ were less than $-A_p$, forcing the sum of signal and noise to fall below the
threshold. To find this error probability, we repeat the experiment N times ($N \to \infty$) and see
how many times the noise at $t = t_s$ is less than $-A_p$ (Fig. 9.3). This information is precisely
one of ensemble statistics of the noise process n(t) at instant $t_s$.

The importance of ensemble statistics is clear from this example. When we are dealing with 
a random process or processes, we do not know which sample function will occur in a given 
trial. Hence, for any statistical specification and characterization of the random process, we 
need to average over the entire ensemble. This is the basic physical reason for the appearance 
of ensemble statistics in random processes. 
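
The counting experiment described above is easy to mimic. The sketch below assumes, for illustration only, that the noise sample at the sampling instant is zero-mean Gaussian with standard deviation sigma; the amplitude, sigma, trial count, and seed are hypothetical. The relative frequency of errors over the ensemble is compared with the Gaussian tail probability $Q(A_p/\sigma)$.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x) = P(standard normal > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def estimate_error_probability(a_p, sigma, num_trials, seed=1):
    """Fraction of trials in which the noise sample at t_s falls below -A_p."""
    rng = random.Random(seed)
    # Each trial draws the noise amplitude of one sample function at t = t_s.
    errors = sum(1 for _ in range(num_trials) if rng.gauss(0.0, sigma) < -a_p)
    return errors / num_trials


a_p, sigma = 1.0, 1.0  # assumed peak amplitude and noise standard deviation
estimate = estimate_error_probability(a_p, sigma, num_trials=200_000)
theory = q_function(a_p / sigma)
print(estimate, theory)
```

Each trial here stands for one sample function of the noise process, so the estimate is an ensemble statistic at the single instant $t_s$, not a time average.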

Autocorrelation Function of a Random Process 

Figure 9.4 Autocorrelation functions for a slowly varying and a rapidly varying random process.

For the purpose of signal analysis, one of the most important (statistical) characteristics of a
random process is its autocorrelation function, which leads to the spectral information of the
random process. The spectral content of a process depends on the rapidity of the amplitude
change with time. This can be measured by correlating amplitudes at $t_1$ and $t_1 + \tau$. On average,
the random process x(t) in Fig. 9.4a is a slowly varying process in comparison to the process
y(t) in Fig. 9.4b. For x(t), the amplitudes at $t_1$ and $t_1 + \tau$ are similar (Fig. 9.4a), that is, have
stronger correlation. On the other hand, for y(t), the amplitudes at $t_1$ and $t_1 + \tau$ have little
resemblance (Fig. 9.4b), that is, have weaker correlation. Recall that correlation is a measure of
the similarity of two RVs. Hence, we can use correlation to measure the similarity of amplitudes
at $t_1$ and $t_2 = t_1 + \tau$. If the RVs $x(t_1)$ and $x(t_2)$ are denoted by $x_1$ and $x_2$, respectively, then
for a real random process,* the autocorrelation function $R_x(t_1, t_2)$ is defined as


$$R_x(t_1, t_2) = \overline{x(t_1)x(t_2)} = \overline{x_1 x_2} \tag{9.3a}$$


This is the correlation of RVs $x(t_1)$ and $x(t_2)$, indicating the similarity between RVs $x(t_1)$ and
$x(t_2)$. It is computed by multiplying amplitudes at $t_1$ and $t_2$ of a sample function and then
averaging this product over the ensemble. It can be seen that for a small $\tau$, the product $x_1x_2$
will be positive for most sample functions of x(t), but the product $y_1y_2$ is equally likely to
be positive or negative. Hence, $\overline{x_1x_2}$ will be larger than $\overline{y_1y_2}$. Moreover, $x_1$ and $x_2$ will show
correlation for considerably larger values of $\tau$, whereas $y_1$ and $y_2$ will lose correlation quickly,
even for small $\tau$, as shown in Fig. 9.4c. Thus, $R_x(t_1, t_2)$, the autocorrelation function of x(t),
provides valuable information about the frequency content of the process. In fact, we shall
show that the PSD of x(t) is the Fourier transform of its autocorrelation function, given by
(for real processes)


$$R_x(t_1, t_2) = \overline{x_1 x_2} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, p_x(x_1, x_2; t_1, t_2)\, dx_1\, dx_2 \tag{9.3b}$$

* For a complex random process x(t), the autocorrelation function is defined as

$$R_x(t_1, t_2) = \overline{x^*(t_1)x(t_2)}$$

Hence, $R_x(t_1, t_2)$ can be derived from the joint PDF of $x_1$ and $x_2$, which is the second-order
PDF.
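
To make the contrast between a slowly varying and a rapidly varying process concrete, the following sketch estimates the autocorrelation by averaging the product of amplitudes at two instants over an ensemble, as in Eq. (9.3a). The discrete-time first-order recursive model used to generate the sample functions is our own stand-in for the processes of Fig. 9.4, and the coefficients, ensemble size, and seed are hypothetical.

```python
import math
import random


def ar1_sample_function(a, length, rng):
    """One sample function of x[n] = a x[n-1] + sqrt(1 - a^2) w[n], unit variance."""
    x = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - a * a)
    for _ in range(length - 1):
        x.append(a * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


def ensemble_autocorrelation(ensemble, n1, n2):
    """Average of x[n1] * x[n2] over all sample functions, as in Eq. (9.3a)."""
    return sum(x[n1] * x[n2] for x in ensemble) / len(ensemble)


rng = random.Random(2)
slow = [ar1_sample_function(0.95, 20, rng) for _ in range(20000)]  # slowly varying
fast = [ar1_sample_function(0.20, 20, rng) for _ in range(20000)]  # rapidly varying

r_slow = ensemble_autocorrelation(slow, 10, 15)
r_fast = ensemble_autocorrelation(fast, 10, 15)
print(r_slow, r_fast)
```

For this assumed model the correlation between amplitudes k samples apart is $a^{|k|}$, so the slowly varying process remains strongly correlated at a separation of five samples while the rapidly varying one has essentially lost its correlation.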

9.2 CLASSIFICATION OF RANDOM PROCESSES 

Random processes may be classified into the following broad categories. 

Stationary and Nonstationary Random Processes 

A random process whose statistical characteristics do not change with time is classified as a
stationary random process. For a stationary process, we can say that a shift of time origin
will be impossible to detect; the process will appear to be the same. Suppose we determine
$p_x(x; t_1)$, then shift the origin by $t_0$, and again determine $p_x(x; t_1)$. The instant $t_1$ in the new
frame of reference is $t_2 = t_1 + t_0$ in the old frame of reference. Hence, the PDFs of x at $t_1$ and
$t_2 = t_1 + t_0$ must be the same; that is, $p_x(x; t_1)$ and $p_x(x; t_2)$ must be identical for a stationary
random process. This is possible only if $p_x(x; t)$ is independent of t. Thus, the first-order
density of a stationary random process can be expressed as

$$p_x(x; t) = p_x(x)$$

Similarly, for a stationary random process the autocorrelation function $R_x(t_1, t_2)$ must depend
on $t_1$ and $t_2$ only through the difference $t_2 - t_1$. If not, we could determine a unique time origin.
Hence, for a real stationary process,

$$R_x(t_1, t_2) = R_x(t_2 - t_1)$$

Therefore,

$$R_x(\tau) = \overline{x(t)x(t + \tau)} \tag{9.4}$$

For a stationary process, the joint PDF for $x_1$ and $x_2$ must also depend only on $t_2 - t_1$. Similarly,
higher order PDFs are all independent of the choice of origin, that is,

$$\begin{aligned}
p_x(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) &= p_x(x_1, x_2, \ldots, x_n; t_1 - t, t_2 - t, \ldots, t_n - t) \quad \forall t \\
&= p_x(x_1, x_2, \ldots, x_n; 0, t_2 - t_1, \ldots, t_n - t_1)
\end{aligned} \tag{9.5}$$

The random process x(t) representing the temperature of a city is an example of a nonstationary
random process because the temperature statistics (mean value, for example) depend
on the time of the day. On the other hand, we can say that the noise process in Fig. 9.3 is
stationary because its statistics (the mean and the mean square values, for example) do not
change with time. In general, it is not easy to determine whether or not a process is stationary
because the nth-order ($n = 1, 2, \ldots, \infty$) statistics must be investigated. In practice, we can
ascertain stationarity if there is no change in the signal-generating mechanism. Such is the case
for the noise process in Fig. 9.3.

Wide-Sense (or Weakly) Stationary Processes 

A process that is not stationary in the strict sense, as discussed in the last subsection, may yet
have a mean value and an autocorrelation function that are independent of the shift of time
origin. This means

$$\overline{x(t)} = \text{constant}$$

and

$$R_x(t_1, t_2) = R_x(\tau), \qquad \tau = t_2 - t_1 \tag{9.6}$$

Such a process is known as a wide-sense stationary, or weakly stationary, process. Note that
stationarity is a stronger condition than wide-sense stationarity. Stationary processes with
well-defined autocorrelation functions are wide-sense stationary; except for Gaussian random
processes, however, the converse is not necessarily true.

Just as no sinusoidal signal exists in actual practice, no truly stationary process can occur
in real life. All processes in practice are nonstationary because they must begin at some finite
time and terminate at some finite time. A truly stationary process must start at $t = -\infty$ and
go on forever. Many processes can be considered stationary for the time interval of interest,
however, and the stationarity assumption allows a manageable mathematical model. The use
of a stationary model is analogous to the use of a sinusoidal model in deterministic analysis.


Example 9.1 


Show that the random process

$$x(t) = A\cos(\omega_c t + \Theta)$$

where $\Theta$ is an RV uniformly distributed in the range $(0, 2\pi)$, is a wide-sense stationary process.

Figure 9.5 Ensemble for the random process $A\cos(\omega_c t + \Theta)$.

The ensemble (Fig. 9.5) consists of sinusoids of constant amplitude A and constant frequency
$\omega_c$, but the phase $\Theta$ is random. For any sample function, the phase is equally
likely to have any value in the range $(0, 2\pi)$. Because $\Theta$ is an RV uniformly distributed
over the range $(0, 2\pi)$, one can determine¹ $p_x(x, t)$ and, hence, $\overline{x(t)}$, as in Eq. (9.2). For
this particular case, however, $\overline{x(t)}$ can be determined directly as a function of random
variable $\Theta$:

$$\overline{x(t)} = \overline{A\cos(\omega_c t + \Theta)} = A\,\overline{\cos(\omega_c t + \Theta)}$$

Because $\cos(\omega_c t + \Theta)$ is a function of an RV $\Theta$, we have [see Eq. (8.61b)]

$$\overline{\cos(\omega_c t + \Theta)} = \int_0^{2\pi} \cos(\omega_c t + \theta)\, p_\Theta(\theta)\, d\theta$$

Because $p_\Theta(\theta) = 1/2\pi$ over $(0, 2\pi)$ and 0 outside this range,

$$\overline{\cos(\omega_c t + \Theta)} = \frac{1}{2\pi}\int_0^{2\pi} \cos(\omega_c t + \theta)\, d\theta = 0$$

Hence,

$$\overline{x(t)} = 0 \tag{9.7a}$$

Thus, the ensemble mean of sample function amplitudes at any instant t is zero.

The autocorrelation function $R_x(t_1, t_2)$ for this process also can be determined directly
from Eq. (9.3a),

$$\begin{aligned}
R_x(t_1, t_2) &= \overline{A^2\cos(\omega_c t_1 + \Theta)\cos(\omega_c t_2 + \Theta)} \\
&= A^2\,\overline{\cos(\omega_c t_1 + \Theta)\cos(\omega_c t_2 + \Theta)} \\
&= \frac{A^2}{2}\left\{\overline{\cos[\omega_c(t_2 - t_1)]} + \overline{\cos[\omega_c(t_2 + t_1) + 2\Theta]}\right\}
\end{aligned}$$

The first term on the right-hand side contains no RV. Hence, $\overline{\cos[\omega_c(t_2 - t_1)]}$ is
$\cos[\omega_c(t_2 - t_1)]$ itself. The second term is a function of the uniform RV $\Theta$, and its mean is

$$\overline{\cos[\omega_c(t_2 + t_1) + 2\Theta]} = \frac{1}{2\pi}\int_0^{2\pi} \cos[\omega_c(t_2 + t_1) + 2\theta]\, d\theta = 0$$

Hence,

$$R_x(t_1, t_2) = \frac{A^2}{2}\cos[\omega_c(t_2 - t_1)] \tag{9.7b}$$

or

$$R_x(\tau) = \frac{A^2}{2}\cos\omega_c\tau, \qquad \tau = t_2 - t_1 \tag{9.7c}$$

From Eqs. (9.7a) and (9.7b) it is clear that x(t) is a wide-sense stationary process.
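
The result of this example can be checked by brute-force ensemble averaging. The sketch below draws the phase many times, averages the amplitude at $t_1$ and the product of amplitudes at $t_1$ and $t_2$, and does so for two different pairs of instants that share the same separation. The amplitude, frequency, instants, trial count, and seed are illustrative values of our own.

```python
import math
import random


def ensemble_stats(amplitude, omega_c, t1, t2, num_trials=200_000, seed=3):
    """Estimate the ensemble mean at t1 and the correlation of x(t1) and x(t2)."""
    rng = random.Random(seed)
    sum_x1 = 0.0
    sum_x1x2 = 0.0
    for _ in range(num_trials):
        theta = rng.uniform(0.0, 2.0 * math.pi)  # one trial = one sample function
        x1 = amplitude * math.cos(omega_c * t1 + theta)
        x2 = amplitude * math.cos(omega_c * t2 + theta)
        sum_x1 += x1
        sum_x1x2 += x1 * x2
    return sum_x1 / num_trials, sum_x1x2 / num_trials


A = 2.0
omega_c = 2.0 * math.pi * 5.0
tau = 0.04

mean_a, corr_a = ensemble_stats(A, omega_c, 0.13, 0.13 + tau)
mean_b, corr_b = ensemble_stats(A, omega_c, 0.70, 0.70 + tau, seed=4)
predicted = (A * A / 2.0) * math.cos(omega_c * tau)  # Eq. (9.7c)
print(mean_a, mean_b, corr_a, corr_b, predicted)
```

Both pairs of instants give a mean close to zero and nearly the same correlation, which is what wide-sense stationarity requires: the estimates depend on the separation $\tau$ and not on where the pair sits on the time axis.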


Ergodic Wide-Sense Stationary Processes 

We have studied the mean and the autocorrelation function of a random process. These are
ensemble averages. For example, $\overline{x(t)}$ is the ensemble average of sample function amplitudes at
t, and $R_x(t_1, t_2) = \overline{x_1x_2}$ is the ensemble average of the product of sample function amplitudes
$x(t_1)$ and $x(t_2)$.

We can also define time averages for each sample function. For example, a time mean $\widetilde{x(t)}$
of a sample function x(t) is*

$$\widetilde{x(t)} = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} x(t)\, dt \tag{9.8a}$$

* Here a sample function $x(t, \zeta_i)$ is represented by x(t) for convenience.





Similarly, the time autocorrelation function $\mathcal{R}_x(\tau)$ defined in Eq. (3.82b) is

$$\mathcal{R}_x(\tau) = \widetilde{x(t)x(t + \tau)} = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} x(t)x(t + \tau)\, dt \tag{9.8b}$$


For ergodic (wide-sense) stationary processes, ensemble averages are equal to the time
averages of any sample function. Thus, for an ergodic process x(t),

$$\overline{x(t)} = \widetilde{x(t)} \tag{9.9a}$$

$$R_x(\tau) = \mathcal{R}_x(\tau) \tag{9.9b}$$

These are the two averages for ergodic wide-sense stationary processes. For the broader
definition of an ergodic process, all possible ensemble averages are equal to the corresponding
time averages of one of its sample functions. Figure 9.6 illustrates the relationship among
different classes of (ergodic) processes. In the coverage of this book, our focus lies in the class
of ergodic wide-sense stationary processes.

Figure 9.6 Classification of random processes.

It is difficult to test whether a process is ergodic or not, because we must test all possible
orders of time and ensemble averages. Nevertheless, in practice many of the stationary
processes are ergodic with respect to at least low-order statistics, such as the mean and the
autocorrelation. For the process in Example 9.1 (Fig. 9.5), we can show that $\widetilde{x(t)} = 0$ and
$\mathcal{R}_x(\tau) = (A^2/2)\cos\omega_c\tau$ (see Prob. 3.8-1). Therefore, this process is ergodic at least with
respect to the first- and second-order averages.

The ergodicity concept can be explained by a simple example of traffic lights in a city.
Suppose the city is well planned, with all its streets in E-W and N-S directions only and with
traffic lights at each intersection. Assume that each light stays green for 0.75 second in the E-W
direction and 0.25 second in the N-S direction and that switching of any light is independent
of the other lights. For the sake of simplicity, we ignore the orange light.

If we consider a certain person driving a car arriving at any traffic light randomly in the
E-W direction, the probability that the person will have a green light is 0.75; that is, on the
average, 75% of the time the person will observe a green light. On the other hand, if we consider
a large number of drivers arriving at a traffic light in the E-W direction at some instant t, then
75% of the drivers will have a green light, and the remaining 25% will have a red light. Thus,
the experience of a single driver arriving randomly many times at a traffic light will contain
the same statistical information (sample function statistics) as that of a large number of drivers
arriving simultaneously at various traffic lights (ensemble statistics) at one instant.

The ergodicity notion is extremely important because we do not have a large number of
sample functions available in practice from which to compute ensemble averages. If the process
is known to be ergodic, then we need only one sample function to compute ensemble averages.
As mentioned earlier, many of the stationary processes encountered in practice are ergodic with
respect to at least second-order averages. As we shall see in dealing with stationary processes
in conjunction with linear systems, we need only the first- and second-order averages. This
means that in most cases we can get by with a single sample function, as is often the case in
practice.
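
The claim that a single sample function suffices can be tried out on the process of Example 9.1. The sketch below fixes one phase, so it works with one sample function only, and approximates the time mean and the time autocorrelation by sums over a whole number of periods. The particular phase, amplitude, frequency, and sampling density are arbitrary choices of ours, and the finite averaging window merely stands in for the limit $T \to \infty$.

```python
import math


def time_averages(amplitude, omega_c, theta, tau,
                  num_periods=100, samples_per_period=1000):
    """Time mean and time autocorrelation of one sample function A cos(w t + theta)."""
    period = 2.0 * math.pi / omega_c
    total_time = num_periods * period          # finite stand-in for T -> infinity
    n = num_periods * samples_per_period
    dt = total_time / n

    def x(t):
        return amplitude * math.cos(omega_c * t + theta)

    mean_sum = 0.0
    corr_sum = 0.0
    for k in range(n):
        t = -total_time / 2.0 + k * dt
        mean_sum += x(t)
        corr_sum += x(t) * x(t + tau)
    # (1/T) * integral is approximated by (dt/T) * sum = (1/n) * sum.
    return mean_sum / n, corr_sum / n


A = 2.0
omega_c = 2.0 * math.pi * 5.0
tau = 0.04
time_mean, time_corr = time_averages(A, omega_c, theta=1.234, tau=tau)
ensemble_corr = (A * A / 2.0) * math.cos(omega_c * tau)  # Eq. (9.7c)
print(time_mean, time_corr, ensemble_corr)
```

The time averages of this one waveform match the ensemble averages found in Example 9.1, whatever phase is chosen, which is the sense in which the process is ergodic with respect to its first- and second-order averages.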


9.3 POWER SPECTRAL DENSITY 

An electrical engineer instinctively thinks of signals and linear systems in terms of their
frequency domain descriptions. Linear systems are characterized by their frequency response (the
transfer function), and signals are expressed in terms of the relative amplitudes and phases of
their frequency components (the Fourier transform). From a knowledge of the input spectrum
and transfer function, the response of a linear system to a given signal can be obtained in
terms of the frequency content of that signal. This is an important analytical procedure for
deterministic signals. We may wonder if similar methods may be found for random processes.
Ideally, all the sample functions of a random process are assumed to exist over the entire time
interval $(-\infty, \infty)$ and, thus, are power signals.* We therefore inquire about the existence
of a power spectral density (PSD). Superficially, the concept of a random process having a
PSD may appear ridiculous for the following reasons. In the first place, we may not be able
to describe a sample function analytically. Second, for a given process, every sample function
may be different from another one. Hence, even if a PSD does exist for each sample function,
it may be different for different sample functions. Fortunately, both problems can be neatly
resolved, and it is possible to define a meaningful PSD for a stationary (at least in the wide
sense) random process. For nonstationary processes, the PSD may not exist.

Whenever randomness is involved, our inquiries can at best provide answers in terms of 
averages. When tossing a coin, for instance, the most we can say about the outcome is that 
on the average we will obtain heads in about half the trials and tails in the remaining half 
of the trials. For random signals or RVs, we do not have enough information to predict the 
outcome with certainty, and we must accept answers in terms of averages. It is not possible 
to transcend this limit of knowledge because of our fundamental ignorance of the process. It 
seems reasonable to define the PSD of a random process as a weighted mean of the PSDs of 
all sample functions. This is the only sensible solution, since we do not know exactly which of 
the sample functions may occur in a given trial. We must be prepared for any sample function. 
Consider, for example, the problem of filtering a certain random process. We would not want 
to design a filter with respect to any one particular sample function because any of the sample 
functions in the ensemble may be present at the input. A sensible approach is to design the filter 
with respect to the mean parameters of the input process. In designing a system to perform 
certain operations, one must design it with respect to the whole ensemble. We are therefore 
justified in defining the PSD $S_x(f)$ of a random process x(t) as the ensemble average of the
PSDs of all sample functions. Thus [see Eq. (3.80)],

$$S_x(f) = \lim_{T\to\infty} \frac{\overline{|X_T(f)|^2}}{T} \quad \text{W/Hz} \tag{9.10a}$$

where $X_T(f)$ is the Fourier transform of the time-truncated random process

$$x_T(t) = x(t)\,\Pi(t/T)$$

and the bar atop represents ensemble average. Note that ensemble averaging is done before
the limiting operation. We shall now show that the PSD as defined in Eq. (9.10a) is the Fourier
transform of the autocorrelation function $R_x(\tau)$ of the process x(t); that is,

$$R_x(\tau) \Longleftrightarrow S_x(f) \tag{9.10b}$$

* As we shall soon see, for the PSD to exist, the process must be stationary (at least in the wide sense). Stationary
processes, because their statistics do not change with time, are power signals.


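This can be proved as follows:

$$X_T(f) = \int_{-\infty}^{\infty} x_T(t)\, e^{-j2\pi ft}\, dt = \int_{-T/2}^{T/2} x(t)\, e^{-j2\pi ft}\, dt \tag{9.11}$$

Thus, for real x(t),

$$\begin{aligned}
|X_T(f)|^2 &= X_T(-f)X_T(f) \\
&= \int_{-T/2}^{T/2} x(t_1)\, e^{j2\pi ft_1}\, dt_1 \int_{-T/2}^{T/2} x(t_2)\, e^{-j2\pi ft_2}\, dt_2 \\
&= \int_{-T/2}^{T/2}\int_{-T/2}^{T/2} x(t_1)x(t_2)\, e^{-j2\pi f(t_2 - t_1)}\, dt_1\, dt_2
\end{aligned}$$

and

$$\begin{aligned}
S_x(f) &= \lim_{T\to\infty} \frac{\overline{|X_T(f)|^2}}{T} \\
&= \lim_{T\to\infty} \frac{1}{T}\, \overline{\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} x(t_1)x(t_2)\, e^{-j2\pi f(t_2 - t_1)}\, dt_1\, dt_2}
\end{aligned} \tag{9.12}$$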


Interchanging the operation of integration and ensemble averaging,* we get

$$\begin{aligned}
S_x(f) &= \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} \overline{x(t_1)x(t_2)}\, e^{-j2\pi f(t_2 - t_1)}\, dt_1\, dt_2 \\
&= \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} R_x(t_2 - t_1)\, e^{-j2\pi f(t_2 - t_1)}\, dt_1\, dt_2
\end{aligned}$$

Here we are assuming that the process x(t) is at least wide-sense stationary, so that
$\overline{x(t_1)x(t_2)} = R_x(t_2 - t_1)$. For convenience, let

$$R_x(t_2 - t_1)\, e^{-j2\pi f(t_2 - t_1)} = \varphi(t_2 - t_1) \tag{9.13}$$

Then,

$$S_x(f) = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} \varphi(t_2 - t_1)\, dt_1\, dt_2 \tag{9.14}$$

* The operation of ensemble averaging is also an operation of integration. Hence, interchanging integration with
ensemble averaging is equivalent to interchanging the order of integration.

Figure 9.7 Derivation of the Wiener-Khintchine theorem.

The integral on the right-hand side is a double integral over the range $(-T/2, T/2)$ for each
of the variables $t_1$ and $t_2$. The square region of integration in the $t_1$-$t_2$ plane is shown in
Fig. 9.7. The integral in Eq. (9.14) is a volume under the surface $\varphi(t_2 - t_1)$ over the square
region in Fig. 9.7. The double integral in Eq. (9.14) can be converted to a single integral by
observing that $\varphi(t_2 - t_1)$ is constant along any line $t_2 - t_1 = \tau$ (a constant) in the $t_1$-$t_2$ plane
(Fig. 9.7).

Let us consider two such lines, $t_2 - t_1 = \tau$ and $t_2 - t_1 = \tau + \Delta\tau$. If $\Delta\tau \to 0$, $\varphi(t_2 - t_1) \simeq \varphi(\tau)$
over the shaded region whose area is $(T - \tau)\Delta\tau$. Hence, the volume under the surface
$\varphi(t_2 - t_1)$ over the shaded region is $\varphi(\tau)(T - \tau)\Delta\tau$. If $\tau$ were negative, the volume would be
$\varphi(\tau)(T + \tau)\Delta\tau$. Hence, in general, the volume over the shaded region is $\varphi(\tau)(T - |\tau|)\Delta\tau$.
The desired volume over the square region in Fig. 9.7 is the sum of the volumes over the shaded
strips and is obtained by integrating $\varphi(\tau)(T - |\tau|)$ over the range of $\tau$, which is $(-T, T)$ (see
Fig. 9.7). Hence,

$$\begin{aligned}
S_x(f) &= \lim_{T\to\infty} \frac{1}{T}\int_{-T}^{T} \varphi(\tau)(T - |\tau|)\, d\tau \\
&= \lim_{T\to\infty} \int_{-T}^{T} \varphi(\tau)\left(1 - \frac{|\tau|}{T}\right) d\tau \\
&= \int_{-\infty}^{\infty} \varphi(\tau)\, d\tau
\end{aligned}$$

provided $\int_{-\infty}^{\infty} |\tau|\varphi(\tau)\, d\tau$ is bounded. Substituting Eq. (9.13) into this equation, we have

$$S_x(f) = \int_{-\infty}^{\infty} R_x(\tau)\, e^{-j2\pi f\tau}\, d\tau \tag{9.15}$$





provided $\int_{-\infty}^{\infty} |\tau| R_x(\tau)\, e^{-j2\pi f\tau}\, d\tau$ is bounded. Thus, the PSD of a wide-sense stationary
random process is the Fourier transform of its autocorrelation function,*

$$R_x(\tau) \Longleftrightarrow S_x(f) \tag{9.16}$$

This is the well-known Wiener-Khintchine theorem, first presented in Chapter 3. 
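
The key step of the derivation, collapsing the double integral over the square into a single integral weighted by $(T - |\tau|)$, has a direct discrete-time counterpart that can be verified numerically. In the sketch below, sums replace integrals, and the autocorrelation $R[k] = a^{|k|}$ together with the values of a and f are assumptions of ours chosen because the resulting transform has a simple closed form.

```python
import cmath
import math


def phi(k, a=0.5, f=0.1):
    """Discrete analogue of Eq. (9.13): R[k] exp(-j 2 pi f k), with R[k] = a^|k|."""
    return (a ** abs(k)) * cmath.exp(-2j * math.pi * f * k)


def double_sum(n):
    """Sum of phi(j - i) over the n-by-n square (analogue of the double integral)."""
    return sum(phi(j - i) for i in range(n) for j in range(n))


def strip_sum(n):
    """Same quantity summed along diagonals: each lag k occurs (n - |k|) times."""
    return sum((n - abs(k)) * phi(k) for k in range(-(n - 1), n))


n_small = 60
lhs = double_sum(n_small)
rhs = strip_sum(n_small)

# For large n, (1/n) * strip_sum(n) approaches the transform of R[k].
n_large = 4000
psd_estimate = strip_sum(n_large) / n_large
psd_closed_form = (1 - 0.5 ** 2) / (1 - 2 * 0.5 * math.cos(2 * math.pi * 0.1) + 0.5 ** 2)
print(abs(lhs - rhs), psd_estimate, psd_closed_form)
```

The two ways of summing agree to rounding error, and the normalized sum settles on the transform of the assumed autocorrelation as the window grows, mirroring the passage from Eq. (9.14) to Eq. (9.15).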

From the discussion thus far, the autocorrelation function emerges as one of the most
significant entities in the spectral analysis of a random process. Earlier we showed heuristically
how the autocorrelation function is connected with the frequency content of a random process.

The autocorrelation function $R_x(\tau)$ for real processes is an even function of $\tau$. This can
be proved in two ways. First, because $|X_T(f)|^2 = |X_T(f)X_T^*(f)| = |X_T(f)X_T(-f)|$ is an even
function of f, $S_x(f)$ is also an even function of f, and its inverse transform, $R_x(\tau)$, is also an
even function of $\tau$ (see Prob. 3.1-1). Alternately, we may argue that

$$R_x(\tau) = \overline{x(t)x(t + \tau)} \quad \text{and} \quad R_x(-\tau) = \overline{x(t)x(t - \tau)}$$

Letting $t - \tau = \sigma$, we have

$$R_x(-\tau) = \overline{x(\sigma)x(\sigma + \tau)} = R_x(\tau) \tag{9.17}$$

The PSD $S_x(f)$ is also a real and even function of f.

The mean square value $\overline{x^2(t)}$ of the random process x(t) is $R_x(0)$,

$$R_x(0) = \overline{x(t)x(t)} = \overline{x^2(t)} = \overline{x^2} \tag{9.18}$$

The mean square value $\overline{x^2}$ is not the time mean square of a sample function but the ensemble
average of the squares of all sample function amplitudes at any instant t.


The Power of a Random Process 


The power $P_x$ (average power) of a wide-sense stationary random process x(t) is its mean square value
$\overline{x^2}$. From Eq. (9.16),

$$R_x(\tau) = \int_{-\infty}^{\infty} S_x(f)\, e^{j2\pi f\tau}\, df$$

Hence, from Eq. (9.18),

$$P_x = \overline{x^2} = R_x(0) = \int_{-\infty}^{\infty} S_x(f)\, df \tag{9.19a}$$

Because $S_x(f)$ is an even function of f, we have

$$P_x = \overline{x^2} = 2\int_0^{\infty} S_x(f)\, df \tag{9.19b}$$

where f is the frequency in hertz. This is the same relationship as that derived for deterministic
signals in Chapter 3 [Eq. (3.81)]. The power $P_x$ is the area under the PSD. Also, $P_x = \overline{x^2}$ is
the ensemble mean of the square amplitudes of the sample functions at any instant.


* It can be shown that Eq. (9.15) holds also for complex random processes, for which we define
$R_x(\tau) = \overline{x^*(t)x(t + \tau)}$.

It is helpful to repeat here, once again, that the PSD may not exist for processes that are
not wide-sense stationary. Hence, in our future discussion, random processes will be assumed
to be at least wide-sense stationary unless specifically stated otherwise.


Example 9.2 Determine the autocorrelation function $R_x(\tau)$ and the power $P_x$ of a low-pass random process
with a white noise PSD $S_x(f) = \mathcal{N}/2$ (Fig. 9.8a).

Figure 9.8 Bandpass white noise PSD and its autocorrelation function.

We have

$$S_x(f) = \frac{\mathcal{N}}{2}\,\Pi\!\left(\frac{f}{2B}\right) \tag{9.20a}$$

Hence, from Table 3.1 (pair 18),

$$R_x(\tau) = \mathcal{N}B\,\mathrm{sinc}\,(2\pi B\tau) \tag{9.20b}$$

This is shown in Fig. 9.8b. Also,

$$P_x = \overline{x^2} = R_x(0) = \mathcal{N}B \tag{9.20c}$$

Alternately,

$$P_x = 2\int_0^B \frac{\mathcal{N}}{2}\, df = \mathcal{N}B \tag{9.20d}$$
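
The transform pair used in Example 9.2 can be confirmed by integrating the flat PSD numerically. The sketch below evaluates the inverse Fourier integral with a midpoint rule and compares it with $\mathcal{N}B\,\mathrm{sinc}(2\pi B\tau)$, taking $\mathrm{sinc}(x) = \sin x / x$ as the convention consistent with Eq. (9.20b). The numerical values of the noise density, the bandwidth, and the delay are arbitrary illustrative choices.

```python
import math


def autocorrelation_from_psd(tau, noise_density, bandwidth, steps=20000):
    """Inverse Fourier transform of the flat PSD N/2 over |f| <= B (midpoint rule)."""
    df = 2.0 * bandwidth / steps
    total = 0.0
    for k in range(steps):
        f = -bandwidth + (k + 0.5) * df
        # The imaginary part of exp(j 2 pi f tau) cancels by symmetry in f.
        total += 0.5 * noise_density * math.cos(2.0 * math.pi * f * tau)
    return total * df


def sinc(x):
    """sinc(x) = sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


noise_density = 2.0   # assumed value of N
bandwidth = 100.0     # assumed value of B in hertz
tau = 0.003

r_numeric = autocorrelation_from_psd(tau, noise_density, bandwidth)
r_formula = noise_density * bandwidth * sinc(2.0 * math.pi * bandwidth * tau)
power = autocorrelation_from_psd(0.0, noise_density, bandwidth)  # area under the PSD
print(r_numeric, r_formula, power)
```

Setting the delay to zero reduces the integral to the area under the PSD, which reproduces the power $\mathcal{N}B$ found in Eqs. (9.20c) and (9.20d).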
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Example 9.3 

Determine the PSD and the mean square value of a random process

$$x(t) = A\cos(\omega_c t + \Theta)$$

where $\Theta$ is an RV uniformly distributed over $(0, 2\pi)$.

For this case $R_x(\tau)$ is already determined [Eq. (9.7c)],

$$R_x(\tau) = \frac{A^2}{2}\cos\omega_c\tau \tag{9.21a}$$

Hence,

$$S_x(f) = \frac{A^2}{4}\left[\delta(f + f_c) + \delta(f - f_c)\right] \tag{9.21b}$$

$$P_x = \overline{x^2} = R_x(0) = \frac{A^2}{2} \tag{9.21c}$$

Thus, the power, or the mean square value, of the process $x(t) = A\cos(\omega_c t + \Theta)$ is $A^2/2$.
The power $P_x$ can also be obtained by integrating $S_x(f)$ with respect to f.


Example 9.4 Amplitude Modulation

Determine the autocorrelation function and the PSD of the DSB-SC-modulated process
$m(t)\cos(\omega_c t + \Theta)$, where m(t) is a wide-sense stationary random process, and $\Theta$ is an
RV uniformly distributed over $(0, 2\pi)$ and independent of m(t).

Let

$$\varphi(t) = m(t)\cos(\omega_c t + \Theta)$$

Then

$$R_\varphi(\tau) = \overline{m(t)\cos(\omega_c t + \Theta) \cdot m(t + \tau)\cos[\omega_c(t + \tau) + \Theta]}$$

Because m(t) and $\Theta$ are independent, we can write [see Eqs. (8.64b) and (9.7c)]

$$\begin{aligned}
R_\varphi(\tau) &= \overline{m(t)m(t + \tau)}\ \overline{\cos(\omega_c t + \Theta)\cos[\omega_c(t + \tau) + \Theta]} \\
&= \frac{1}{2}R_m(\tau)\cos\omega_c\tau
\end{aligned} \tag{9.22a}$$

Consequently,*

$$S_\varphi(f) = \frac{1}{4}\left[S_m(f + f_c) + S_m(f - f_c)\right] \tag{9.22b}$$

From Eq. (9.22a) it follows that

$$\overline{\varphi^2(t)} = R_\varphi(0) = \frac{1}{2}R_m(0) = \frac{1}{2}\overline{m^2(t)} \tag{9.22c}$$

* We obtain the same result even if $\varphi(t) = m(t)\sin(\omega_c t + \Theta)$.

Hence, the power of the DSB-SC-modulated signal is half the power of the modulating
signal. We derived the same result earlier [Eq. (3.93)] for deterministic signals.


We note that, without the random phase $\Theta$, a DSB-SC amplitude-modulated signal
$m(t)\cos(\omega_c t)$ is in fact not wide-sense stationary. To find its PSD, we can resort to the
time autocorrelation concept of Chapter 3.
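
The role of the random phase can be seen in a small simulation. The sketch below estimates the ensemble mean square of the modulated signal at two instants, once with a uniformly distributed phase and once with the phase fixed at zero. Modeling the value of m(t) at a fixed instant as a zero-mean Gaussian RV with standard deviation sigma_m is our own simplifying assumption, as are the carrier frequency, trial count, and seed.

```python
import math
import random


def mean_square_dsb(t, omega_c, sigma_m, random_phase, num_trials=100_000, seed=5):
    """Ensemble mean square of m(t) cos(omega_c t + Theta) at the instant t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_trials):
        m = rng.gauss(0.0, sigma_m)  # assumed model for the value of m(t)
        theta = rng.uniform(0.0, 2.0 * math.pi) if random_phase else 0.0
        total += (m * math.cos(omega_c * t + theta)) ** 2
    return total / num_trials


omega_c = 2.0 * math.pi * 10.0
sigma_m = 1.0
t_peak = 0.0       # carrier at its peak
t_zero = 0.025     # a quarter of a carrier period later, carrier at a zero crossing

with_phase_peak = mean_square_dsb(t_peak, omega_c, sigma_m, random_phase=True)
with_phase_zero = mean_square_dsb(t_zero, omega_c, sigma_m, random_phase=True)
fixed_phase_peak = mean_square_dsb(t_peak, omega_c, sigma_m, random_phase=False)
fixed_phase_zero = mean_square_dsb(t_zero, omega_c, sigma_m, random_phase=False)
print(with_phase_peak, with_phase_zero, fixed_phase_peak, fixed_phase_zero)
```

With the random phase, the mean square is close to half the mean square of m(t) at both instants, in agreement with Eq. (9.22c). With the phase fixed, it swings between the full mean square of m(t) and essentially zero depending on the instant, so the second-order statistics depend on t and the process cannot be wide-sense stationary.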


Example 9.5 Random Binary Process 


In this example we shall consider a random binary process for which a typical sample
function is shown in Fig. 9.9a. The signal can assume only two states (values), 1 or $-1$,
with equal probability. The transition from one state to another can take place only at node
points, which occur every $T_b$ seconds. The probability of a transition from one state to the
other is 0.5. The first node is equally likely to be situated at any instant within the interval
0 to $T_b$ from the origin. Analytically, we can represent x(t) as

$$x(t) = \sum_n a_n\, p(t - nT_b - \alpha)$$

Figure 9.9 Derivation of autocorrelation function and PSD of a random binary process.

where $\alpha$ is an RV uniformly distributed over the range $(0, T_b)$ and $p(t)$ is the basic pulse (in this case $\Pi[(t - T_b/2)/T_b]$). Note that $\alpha$ is the distance of the first node from the origin, and it varies randomly from sample function to sample function. In addition, $a_n$ is random, taking values 1 or $-1$ with equal probability. The amplitudes at $t$ represent RV $x_1$, and those at $t + \tau$ represent RV $x_2$. Note that $x_1$ and $x_2$ are discrete and each can assume only two values, $-1$ and 1. Hence,

$$R_x(\tau) = \overline{x_1 x_2} = \sum_{x_1}\sum_{x_2} x_1 x_2\, P_{x_1 x_2}(x_1, x_2)$$

$$= P_{x_1 x_2}(1, 1) + P_{x_1 x_2}(-1, -1) - P_{x_1 x_2}(-1, 1) - P_{x_1 x_2}(1, -1) \tag{9.23a}$$

By symmetry, the first two terms and the last two terms on the right-hand side are equal. Therefore,

$$R_x(\tau) = 2\left[P_{x_1 x_2}(1, 1) - P_{x_1 x_2}(1, -1)\right] \tag{9.23b}$$

From Bayes' rule, we have

$$R_x(\tau) = 2P_{x_1}(1)\left[P_{x_2|x_1}(1|1) - P_{x_2|x_1}(-1|1)\right] = P_{x_2|x_1}(1|1) - P_{x_2|x_1}(-1|1)$$

Moreover,

$$P_{x_2|x_1}(1|1) = 1 - P_{x_2|x_1}(-1|1)$$

Hence,

$$R_x(\tau) = 1 - 2P_{x_2|x_1}(-1|1) \tag{9.23c}$$

It is helpful to compute $R_x(\tau)$ for small values of $\tau$ first. Let us consider the case of $\tau < T_b$, where, at most, one node is in the interval $t$ to $t + \tau$. In this case, the event $x_2 = -1$ given $x_1 = 1$ is a joint event $A \cap B$, where the event $A$ is "a node in the interval $(t, t + \tau)$" and $B$ is "the state change at this node." Because $A$ and $B$ are independent events,

$$P_{x_2|x_1}(-1|1) = P(\text{a node lies in } t \text{ to } t + \tau)\,P(\text{state change}) = \frac{1}{2}\,P(\text{a node lies in } t \text{ to } t + \tau)$$

Figure 9.9b shows adjacent nodes $n_1$ and $n_2$, between which $t$ lies. We mark off the interval $\tau$ from the node $n_2$. If $t$ lies anywhere in this interval (sawtooth line), the node $n_2$ lies within $t$ and $t + \tau$. But because the instant $t$ is chosen arbitrarily between nodes $n_1$ and $n_2$, it is equally likely to be at any instant over the $T_b$ seconds between $n_1$ and $n_2$, and the probability that $t$ lies in the shaded interval is simply $\tau/T_b$. Therefore,

$$P_{x_2|x_1}(-1|1) = \frac{1}{2}\left(\frac{\tau}{T_b}\right)$$

and

$$R_x(\tau) = 1 - \frac{\tau}{T_b} \qquad \tau < T_b$$

Because $R_x(\tau)$ is an even function of $\tau$, we have

$$R_x(\tau) = 1 - \frac{|\tau|}{T_b} \qquad |\tau| < T_b$$

Next, consider the range $\tau > T_b$. In this case at least one node lies in the interval $t$ to $t + \tau$. Hence, $x_1$ and $x_2$ become independent, and

$$R_x(\tau) = \overline{x_1 x_2} = \overline{x_1}\;\overline{x_2} = 0$$



where, by inspection, we observe that $\overline{x_1} = \overline{x_2} = 0$ (Fig. 9.9a). This result can also be obtained by observing that for $|\tau| > T_b$, $x_1$ and $x_2$ are independent, and it is equally likely that $x_2 = 1$ or $-1$ given that $x_1 = 1$ (or $-1$). Hence, all four probabilities in Eq. (9.23a) are equal to 1/4, and

$$R_x(\tau) = 0 \qquad \tau > T_b$$

Therefore,

$$R_x(\tau) = \begin{cases} 1 - |\tau|/T_b & |\tau| < T_b \\ 0 & |\tau| > T_b \end{cases} \tag{9.27a}$$

and

$$S_x(f) = T_b\,\mathrm{sinc}^2(\pi f T_b) \tag{9.27b}$$

The autocorrelation function and the PSD of this process are shown in Fig. 9.9c and d. Observe that $\overline{x^2} = R_x(0) = 1$, as expected.
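The transform pair in Eqs. (9.27a) and (9.27b) can be confirmed by direct numerical integration. The following is a minimal sketch, assuming $T_b = 1$ and the convention $\mathrm{sinc}(x) = \sin x / x$; the function names and the midpoint rule are choices made here for illustration.

```python
import math

def triangle_autocorr(tau, Tb):
    """R_x(tau) of Eq. (9.27a): 1 - |tau|/Tb inside (-Tb, Tb), zero outside."""
    return max(0.0, 1.0 - abs(tau) / Tb)

def psd_by_integration(f, Tb, n_steps=20000):
    """Midpoint-rule Fourier transform of R_x; R_x is even, so only the cosine part survives."""
    d_tau = 2.0 * Tb / n_steps
    total = 0.0
    for i in range(n_steps):
        tau = -Tb + (i + 0.5) * d_tau
        total += triangle_autocorr(tau, Tb) * math.cos(2.0 * math.pi * f * tau)
    return total * d_tau

def psd_closed_form(f, Tb):
    """Eq. (9.27b), with sinc(x) = sin(x)/x."""
    x = math.pi * f * Tb
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return Tb * sinc * sinc

Tb = 1.0  # hypothetical bit interval
for f in (0.0, 0.3, 1.0, 1.7):
    print(f, psd_by_integration(f, Tb), psd_closed_form(f, Tb))
```

The two columns agree to within the integration error, including the spectral null at $f = 1/T_b$.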


The random binary process described in Example 9.5 is sometimes known as the telegraph signal. This process also coincides with the polar signaling of Sec. 7.2.2 when the pulse shape is a rectangular NRZ pulse (Fig. 7.2). For wide-sense stationarity, the signal's initial starting point $\alpha$ is randomly distributed.

Let us now consider a more general case of the pulse train $y(t)$, discussed in Sec. 7.2 (Fig. 7.4). From the knowledge of the PSD of this train, we can derive the PSD of on-off, polar, bipolar, duobinary, split-phase, and many more important digital signals.


Example 9.6 Random PAM Pulse Train 

Digital data is transmitted by using a basic pulse $p(t)$, as shown in Fig. 9.10a. The successive pulses are separated by $T_b$ seconds, and the $k$th pulse is $a_k p(t)$, where $a_k$ is an RV. The distance $\alpha$ of the first pulse (corresponding to $k = 0$) from the origin is equally likely to be any value in the range $(0, T_b)$. Find the autocorrelation function and the PSD of such a random pulse train $y(t)$ whose sample function is shown in Fig. 9.10b. The random process $y(t)$ can be described as

$$y(t) = \sum_{k=-\infty}^{\infty} a_k\, p(t - kT_b - \alpha)$$

where $\alpha$ is an RV uniformly distributed in the interval $(0, T_b)$. Thus, $\alpha$ is different for each sample function. Note that $p(\alpha) = 1/T_b$ over the interval $(0, T_b)$ and is zero everywhere else.* It can be shown that $\overline{y(t)} = (\overline{a_k}/T_b)\int_{-\infty}^{\infty} p(t)\,dt$ is a constant.†

* If $\alpha = 0$, the process can be expressed as $y(t) = \sum_k a_k\, p(t - kT_b)$. In this case $\overline{y(t)} = \overline{a_k}\sum_k p(t - kT_b)$ is not constant, but is periodic with period $T_b$. Similarly, we can show that the autocorrelation function is periodic with the same period $T_b$. This is an example of a cyclostationary, or periodically stationary, process (a process whose statistics are invariant to a shift of the time origin by integral multiples of a constant $T_b$). Cyclostationary processes, as seen here, are clearly not wide-sense stationary. But they can be made wide-sense stationary with slight modification by adding the RV $\alpha$ in the expression of $y(t)$, as in this example.

† Using exactly the same approach, as seen shortly in the derivation of Eq. (9.28), we can show that $\overline{y(t)} = (\overline{a_k}/T_b)\int_{-\infty}^{\infty} p(t)\,dt$.

Figure 9.10 Random PAM process.


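We have the expression

$$R_y(\tau) = \overline{y(t)\,y(t+\tau)}$$

$$= \overline{\sum_{k=-\infty}^{\infty} a_k\, p(t - kT_b - \alpha) \sum_{m=-\infty}^{\infty} a_m\, p(t + \tau - mT_b - \alpha)}$$

$$= \sum_{k=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} \overline{a_k a_m\, p(t - kT_b - \alpha)\, p(t + \tau - mT_b - \alpha)}$$

Because $a_k$ and $a_m$ are independent of $\alpha$,

$$R_y(\tau) = \sum_{k=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} \overline{a_k a_m}\cdot\overline{p(t - kT_b - \alpha)\, p(t + \tau - mT_b - \alpha)}$$

Both $k$ and $m$ are integers. Letting $m = k + n$, this expression can be written

$$R_y(\tau) = \sum_{k=-\infty}^{\infty}\sum_{n=-\infty}^{\infty} \overline{a_k a_{k+n}}\cdot\overline{p(t - kT_b - \alpha)\, p(t + \tau - [k+n]T_b - \alpha)}$$

The first term under the double sum is the correlation of RVs $a_k$ and $a_{k+n}$ and will be denoted by $\mathcal{R}_n$. The second term, being a mean with respect to the RV $\alpha$, can be expressed as an integral. Thus,

$$R_y(\tau) = \sum_{k=-\infty}^{\infty}\sum_{n=-\infty}^{\infty} \mathcal{R}_n \int_{-\infty}^{\infty} p(t - kT_b - \alpha)\, p(t + \tau - [k+n]T_b - \alpha)\, p(\alpha)\, d\alpha$$

Recall that $\alpha$ is uniformly distributed over the interval 0 to $T_b$. Hence, $p(\alpha) = 1/T_b$ over the interval $(0, T_b)$, and is zero otherwise. Therefore,

$$R_y(\tau) = \sum_{k=-\infty}^{\infty}\sum_{n=-\infty}^{\infty} \mathcal{R}_n \frac{1}{T_b}\int_{0}^{T_b} p(t - kT_b - \alpha)\, p(t + \tau - [k+n]T_b - \alpha)\, d\alpha$$

$$= \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \mathcal{R}_n \sum_{k=-\infty}^{\infty}\int_{t-(k+1)T_b}^{t-kT_b} p(\beta)\, p(\beta + \tau - nT_b)\, d\beta$$

$$= \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \mathcal{R}_n \int_{-\infty}^{\infty} p(\beta)\, p(\beta + \tau - nT_b)\, d\beta$$

The integral on the right-hand side is the time autocorrelation function of the pulse $p(t)$ with the argument $\tau - nT_b$. Thus,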


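$$R_y(\tau) = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \mathcal{R}_n\, \psi_p(\tau - nT_b) \tag{9.28}$$

where

$$\mathcal{R}_n = \overline{a_k a_{k+n}} \tag{9.29}$$

and

$$\psi_p(\tau) = \int_{-\infty}^{\infty} p(t)\, p(t + \tau)\, dt \tag{9.30}$$

As seen in Eq. (3.74), if $p(t) \Longleftrightarrow P(f)$, then $\psi_p(\tau) \Longleftrightarrow |P(f)|^2$. Therefore, the PSD of $y(t)$, which is the Fourier transform of $R_y(\tau)$, is given by

$$S_y(f) = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \mathcal{R}_n\, |P(f)|^2 e^{-jn2\pi f T_b} = \frac{|P(f)|^2}{T_b}\sum_{n=-\infty}^{\infty} \mathcal{R}_n\, e^{-jn2\pi f T_b} \tag{9.31}$$

This result is similar to that found in Eq. (7.11b). The only difference is the use of the ensemble average in defining $\mathcal{R}_n$ in this chapter, whereas $R_n$ in Chapter 7 is the time average.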


Example 9.7 Find the PSD $S_y(f)$ for a polar binary random signal where 1 is transmitted by a pulse $p(t)$ (Fig. 9.11) whose Fourier transform is $P(f)$, and 0 is transmitted by $-p(t)$. The digits 1 and 0 are equally likely, and one digit is transmitted every $T_b$ seconds. Each digit is independent of the other digits.

Figure 9.11 Basic pulse for a random binary polar signal.

In this case, $a_k$ can take on values 1 and $-1$ with probability 1/2 each. Hence,

$$\overline{a_k} = \sum_{a_k = 1, -1} a_k P(a_k) = (1)P_{a_k}(1) + (-1)P_{a_k}(-1) = \frac{1}{2} - \frac{1}{2} = 0$$

$$\mathcal{R}_0 = \overline{a_k^2} = \sum_{a_k = 1, -1} a_k^2 P(a_k) = (1)^2 P_{a_k}(1) + (-1)^2 P_{a_k}(-1) = \frac{1}{2}(1)^2 + \frac{1}{2}(-1)^2 = 1$$

and because each digit is independent of the remaining digits,

$$\mathcal{R}_n = \overline{a_k a_{k+n}} = \overline{a_k}\;\overline{a_{k+n}} = 0 \qquad n \ge 1$$

Hence, from Eq. (9.31),

$$S_y(f) = \frac{|P(f)|^2}{T_b}$$

We already found this result in Eq. (7.13), where we used time averaging instead of ensemble averaging. When a process is ergodic of second order (or higher), the ensemble and time averages yield the same result. Note that Example 9.5 is a special case of this result, where $p(t)$ is a full-width rectangular pulse $\Pi(t/T_b)$ with $P(f) = T_b\,\mathrm{sinc}(\pi f T_b)$, and

$$S_y(f) = \frac{|P(f)|^2}{T_b} = T_b\,\mathrm{sinc}^2(\pi f T_b)$$
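Equation (9.31) lends itself to a small evaluator once the amplitude correlations $\mathcal{R}_n$ are known. The sketch below is illustrative only: it assumes real amplitudes (so $\mathcal{R}_{-n} = \mathcal{R}_n$), a finite set of nonzero correlations, and a full-width rectangular pulse; the names `pam_psd` and `rect_spectrum` are hypothetical.

```python
import math

def pam_psd(f, Tb, pulse_spectrum, correlations):
    """Evaluate Eq. (9.31): S_y(f) = |P(f)|^2 / Tb * sum_n R_n exp(-j n 2 pi f Tb).

    correlations maps n >= 0 to R_n. R_{-n} = R_n is assumed, so each pair
    (n, -n) contributes 2 R_n cos(2 pi n f Tb); unlisted n are taken as zero.
    """
    total = 0.0
    for n, r_n in correlations.items():
        if n == 0:
            total += r_n
        else:
            total += 2.0 * r_n * math.cos(2.0 * math.pi * n * f * Tb)
    return abs(pulse_spectrum(f)) ** 2 / Tb * total

def rect_spectrum(Tb):
    """P(f) = Tb sinc(pi f Tb) for the full-width rectangular pulse."""
    def P(f):
        x = math.pi * f * Tb
        return Tb if x == 0.0 else Tb * math.sin(x) / x
    return P

Tb = 1.0                   # hypothetical bit interval
polar = {0: 1.0}           # R_0 = 1, R_n = 0 otherwise (polar signaling)
for f in (0.0, 0.25, 0.5):
    print(f, pam_psd(f, Tb, rect_spectrum(Tb), polar))
```

For the polar case the output reduces to $T_b\,\mathrm{sinc}^2(\pi f T_b)$, matching Example 9.5.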


Example 9.8 Find the PSD $S_y(f)$ for on-off and bipolar random signals which use a basic pulse for $p(t)$, as shown in Fig. 9.11. The digits 1 and 0 are equally likely, and digits are transmitted every $T_b$ seconds. Each digit is independent of the remaining digits. All these line codes are described in Sec. 7.2.

In each case we shall first determine $\mathcal{R}_0, \mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_n$.

(a) On-off signaling: In this case, $a_k$ can take on values 1 and 0 with probability 1/2 each. Hence,

$$\overline{a_k} = (1)P_{a_k}(1) + (0)P_{a_k}(0) = \frac{1}{2}(1) + \frac{1}{2}(0) = \frac{1}{2}$$

$$\mathcal{R}_0 = \overline{a_k^2} = (1)^2 P_{a_k}(1) + (0)^2 P_{a_k}(0) = \frac{1}{2}(1)^2 + \frac{1}{2}(0)^2 = \frac{1}{2}$$

and because each digit is independent of the remaining digits,

$$\mathcal{R}_n = \overline{a_k a_{k+n}} = \overline{a_k}\;\overline{a_{k+n}} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = \frac{1}{4} \qquad n \ge 1$$

Therefore, from Eq. (9.31),


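$$S_y(f) = \frac{|P(f)|^2}{T_b}\left[\frac{1}{2} + \frac{1}{4}\sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} e^{-jn2\pi f T_b}\right] \tag{9.32a}$$

$$= \frac{|P(f)|^2}{T_b}\left[\frac{1}{4} + \frac{1}{4}\sum_{n=-\infty}^{\infty} e^{-jn2\pi f T_b}\right] \tag{9.32b}$$

Equation (9.32b) is obtained from Eq. (9.32a) by splitting the term 1/2 corresponding to $\mathcal{R}_0$ into two: 1/4 outside the summation and 1/4 inside the summation (corresponding to $n = 0$). This result is identical to Eq. (7.18b) found earlier by using time averages.

We now use a Poisson summation formula,*

$$\sum_{n=-\infty}^{\infty} e^{-jn2\pi f T_b} = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \delta\!\left(f - \frac{n}{T_b}\right)$$

Substitution of this result into Eq. (9.32b) yields

$$S_y(f) = \frac{|P(f)|^2}{4T_b}\left[1 + \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \delta\!\left(f - \frac{n}{T_b}\right)\right] \tag{9.32c}$$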


Note that the spectrum $S_y(f)$ consists of both a discrete and a continuous part. A discrete component of clock frequency ($R_b = 1/T_b$) is present in the spectrum. The continuous component of the spectrum, $|P(f)|^2/4T_b$, is identical (except for a scaling factor 1/4) to the spectrum of the polar signal in Example 9.7. This is a logical result because, as Fig. 7.3 shows, an on-off signal can be expressed as a sum of a polar and a periodic component. The polar component is exactly half the polar signal discussed earlier. Hence, the PSD of this component is one-fourth of the PSD of the polar signal. The periodic component is of clock frequency $R_b$, and consists of discrete components of frequency $R_b$ and its harmonics.

* The impulse train in Fig. 3.23a is $\delta_{T_b}(t)$, which can be expressed as $\delta_{T_b}(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT_b)$. Also $\delta(t - nT_b) \Longleftrightarrow e^{-jn2\pi f T_b}$. Hence, the Fourier transform of this impulse train is $\sum_{n=-\infty}^{\infty} e^{-jn2\pi f T_b}$. But we found the alternate form of the Fourier transform of this train in Eq. (3.43) (Example 3.11). Hence,

$$\sum_{n=-\infty}^{\infty} e^{-jn2\pi f T_b} = \frac{1}{T_b}\sum_{n=-\infty}^{\infty} \delta\!\left(f - \frac{n}{T_b}\right)$$

(b) Bipolar signaling: In this case, $a_k$ can take on values 0, 1, and $-1$ with probabilities 1/2, 1/4, and 1/4, respectively. Hence,

$$\overline{a_k} = (0)P_{a_k}(0) + (1)P_{a_k}(1) + (-1)P_{a_k}(-1) = \frac{1}{2}(0) + \frac{1}{4}(1) + \frac{1}{4}(-1) = 0$$

$$\mathcal{R}_0 = \overline{a_k^2} = (0)^2 P_{a_k}(0) + (1)^2 P_{a_k}(1) + (-1)^2 P_{a_k}(-1) = \frac{1}{2}(0)^2 + \frac{1}{4}(1)^2 + \frac{1}{4}(-1)^2 = \frac{1}{2}$$

Also,

$$\mathcal{R}_1 = \overline{a_k a_{k+1}} = \sum_{a_k}\sum_{a_{k+1}} a_k a_{k+1}\, P_{a_k a_{k+1}}(a_k, a_{k+1})$$

Because $a_k$ and $a_{k+1}$ can take three values each, the sum on the right-hand side has nine terms, of which only four terms (corresponding to values $\pm 1$ for $a_k$ and $a_{k+1}$) are nonzero. Thus,

$$\mathcal{R}_1 = (1)(1)P_{a_k a_{k+1}}(1, 1) + (-1)(1)P_{a_k a_{k+1}}(-1, 1) + (1)(-1)P_{a_k a_{k+1}}(1, -1) + (-1)(-1)P_{a_k a_{k+1}}(-1, -1)$$

Because of the bipolar rule,

$$P_{a_k a_{k+1}}(1, 1) = P_{a_k a_{k+1}}(-1, -1) = 0$$

and

$$P_{a_k a_{k+1}}(-1, 1) = P_{a_k}(-1)\,P_{a_{k+1}|a_k}(1|-1) = \frac{1}{8}$$

Similarly, we find $P_{a_k a_{k+1}}(1, -1) = 1/8$. Substitution of these values in $\mathcal{R}_1$ yields

$$\mathcal{R}_1 = -\frac{1}{4}$$

For $n \ge 2$, the pulse strengths $a_k$ and $a_{k+n}$ become independent. Hence,

$$\mathcal{R}_n = \overline{a_k a_{k+n}} = \overline{a_k}\;\overline{a_{k+n}} = (0)(0) = 0 \qquad n \ge 2$$

Substitution of these values in Eq. (9.31), and noting that $\mathcal{R}_n$ is an even function of $n$, yields

$$S_y(f) = \frac{|P(f)|^2}{T_b}\sin^2(\pi f T_b)$$
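The bipolar correlations can also be obtained by brute-force enumeration, which is a useful cross-check on the joint-probability argument. The sketch below is a hypothetical helper, not part of the text: it encodes every equally likely bit window with the alternate-mark rule and averages $a_k a_{k+n}$ exactly. Starting each window with positive polarity is an assumption that does not affect the product, since flipping the starting polarity flips both factors.

```python
from fractions import Fraction
from itertools import product

def ami_encode(bits, first_polarity=1):
    """Bipolar (AMI) rule: 0 -> 0, and successive 1s alternate in sign."""
    polarity = first_polarity
    out = []
    for b in bits:
        if b == 1:
            out.append(polarity)
            polarity = -polarity
        else:
            out.append(0)
    return out

def bipolar_correlation(n):
    """Exact R_n = E[a_k a_{k+n}] for equally likely, independent bits."""
    length = n + 1
    total = Fraction(0)
    for bits in product((0, 1), repeat=length):
        a = ami_encode(bits)
        total += Fraction(a[0] * a[n], 2 ** length)
    return total

for n in range(4):
    print(n, bipolar_correlation(n))
```

The enumeration reproduces $\mathcal{R}_0 = 1/2$ and $\mathcal{R}_1 = -1/4$, with zero correlation at the longer lags tried here.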


The bipolar PSD above is identical to Eq. (7.21b) found earlier by using time averages.

9.4 MULTIPLE RANDOM PROCESSES

For two real random processes $x(t)$ and $y(t)$, we define the cross-correlation function* $R_{xy}(t_1, t_2)$ as

$$R_{xy}(t_1, t_2) = \overline{x(t_1)\,y(t_2)} \tag{9.33a}$$

The two processes are said to be jointly stationary (in the wide sense) if each of the processes is individually wide-sense stationary and if

$$R_{xy}(t_1, t_2) = R_{xy}(t_2 - t_1) = R_{xy}(\tau) \tag{9.33b}$$

Uncorrelated, Orthogonal (Incoherent), and Independent Processes

Two processes $x(t)$ and $y(t)$ are said to be uncorrelated if their cross-correlation function is equal to the product of their means; that is,

$$R_{xy}(\tau) = \overline{x(t)\,y(t+\tau)} = \bar{x}\,\bar{y} \tag{9.34}$$

This implies that RVs $x(t)$ and $y(t + \tau)$ are uncorrelated for all $t$ and $\tau$.

Processes $x(t)$ and $y(t)$ are said to be incoherent, or orthogonal, if

$$R_{xy}(\tau) = 0 \tag{9.35}$$

Incoherent, or orthogonal, processes are uncorrelated processes with $\bar{x}$ and/or $\bar{y} = 0$.

Processes $x(t)$ and $y(t)$ are independent random processes if the random variables $x(t_1)$ and $y(t_2)$ are independent for all possible choices of $t_1$ and $t_2$.


Cross-Power Spectral Density 

We define the cross-power spectral density $S_{xy}(f)$ for two random processes $x(t)$ and $y(t)$ as

$$S_{xy}(f) = \lim_{T \to \infty} \frac{\overline{X_T^*(f)\,Y_T(f)}}{T} \tag{9.36}$$

where $X_T(f)$ and $Y_T(f)$ are the Fourier transforms of the truncated processes $x(t)\,\Pi(t/T)$ and $y(t)\,\Pi(t/T)$, respectively. Proceeding along the lines of the derivation of Eq. (9.16), it can be shown that†

$$R_{xy}(\tau) \Longleftrightarrow S_{xy}(f) \tag{9.37a}$$

It can be seen from Eqs. (9.33) that for real random processes $x(t)$ and $y(t)$,

$$R_{xy}(\tau) = R_{yx}(-\tau) \tag{9.37b}$$

Therefore,

$$S_{xy}(f) = S_{yx}(-f) \tag{9.37c}$$

* For complex random processes, the cross-correlation function is defined as $R_{xy}(t_1, t_2) = \overline{x^*(t_1)\,y(t_2)}$.

† Equation (9.37a) is valid for complex processes as well.

Figure 9.12 Transmission of a random process through a linear time-invariant system.

9.5 TRANSMISSION OF RANDOM PROCESSES THROUGH LINEAR SYSTEMS

If a random process $x(t)$ is applied at the input of a stable linear time-invariant system (Fig. 9.12) with transfer function $H(f)$, we can determine the autocorrelation function and the PSD of the output process $y(t)$. We now show that


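$$R_y(\tau) = h(\tau) * h(-\tau) * R_x(\tau) \tag{9.38}$$

and

$$S_y(f) = |H(f)|^2 S_x(f) \tag{9.39}$$

To prove this, we observe that

$$y(t) = \int_{-\infty}^{\infty} h(\alpha)\,x(t - \alpha)\,d\alpha$$

and

$$y(t + \tau) = \int_{-\infty}^{\infty} h(\alpha)\,x(t + \tau - \alpha)\,d\alpha$$

Hence,*

$$R_y(\tau) = \overline{y(t)\,y(t+\tau)} = \overline{\int_{-\infty}^{\infty} h(\alpha)\,x(t - \alpha)\,d\alpha \int_{-\infty}^{\infty} h(\beta)\,x(t + \tau - \beta)\,d\beta}$$

$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\alpha)\,h(\beta)\,\overline{x(t - \alpha)\,x(t + \tau - \beta)}\,d\alpha\,d\beta$$

$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\alpha)\,h(\beta)\,R_x(\tau + \alpha - \beta)\,d\alpha\,d\beta$$

This double integral is precisely the double convolution $h(\tau) * h(-\tau) * R_x(\tau)$. Hence, Eqs. (9.38) and (9.39) follow.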
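A discrete-time analogue makes the link between Eqs. (9.38) and (9.39) concrete. The sketch below is an illustration under stated assumptions rather than the continuous-time result itself: the input is taken as white with $R_x[k] = \delta[k]$, the impulse response `h` is an arbitrary hypothetical three-tap filter, and frequency is measured in cycles per sample.

```python
import cmath
import math

def output_autocorr(h, k):
    """R_y[k] = sum_n h[n] h[n+k], i.e. h[k] * h[-k] * delta[k] for white input."""
    k = abs(k)
    return sum(h[n] * h[n + k] for n in range(len(h) - k))

def dtft(samples, f):
    """Discrete-time Fourier transform of {index: value} at f cycles/sample."""
    return sum(v * cmath.exp(-2j * math.pi * f * k) for k, v in samples.items())

h = [0.5, 0.3, -0.2]  # hypothetical impulse response
R_y = {k: output_autocorr(h, k) for k in range(-(len(h) - 1), len(h))}
H = {n: h[n] for n in range(len(h))}

for f in (0.0, 0.1, 0.37):
    S_y = dtft(R_y, f)  # transform of the output autocorrelation
    print(f, S_y.real, abs(dtft(H, f)) ** 2)
```

The transform of the output autocorrelation equals $|H(f)|^2$ at each frequency, which is Eq. (9.39) with a flat input PSD of unity.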


Example 9,9 Thermal Noise 

Random thermal motion of electrons in a resistor $R$ causes a random voltage across its terminals. This voltage $n(t)$ is known as the thermal noise. Its PSD $S_n(f)$ is practically flat over a very large band (up to 1000 GHz at room temperature) and is given by [1]

$$S_n(f) = 2kTR \tag{9.40}$$

where $k$ is the Boltzmann constant ($1.38 \times 10^{-23}$ J/K) and $T$ is the ambient temperature in kelvins. A resistor $R$ at a temperature $T$ kelvin can be represented by a noiseless resistor $R$ in series with a random white-noise voltage source (thermal noise) having a PSD of $2kTR$ (Fig. 9.13a). Observe that the thermal noise power over a band $\Delta f$ is $(2kTR)(2\Delta f) = 4kTR\,\Delta f$.

Figure 9.13 Thermal noise representation in a resistor.

* In this development, we interchange the operations of averaging and integrating. Because averaging is really an operation of integration, we are really changing the order of integration, and we assume that such a change is permissible.


Let us calculate the thermal noise voltage (rms value) across the simple RC circuit in Fig. 9.13b. The resistor $R$ is replaced by an equivalent noiseless resistor in series with the thermal noise voltage source. The transfer function $H(f)$ relating the voltage at terminals a-b to the thermal noise voltage is given by

$$H(f) = \frac{1/j2\pi f C}{R + 1/j2\pi f C} = \frac{1}{1 + j2\pi f RC}$$

If $S_o(f)$ is the PSD of the voltage $v_o$, then from Eq. (9.39) we have

$$S_o(f) = \left|\frac{1}{1 + j2\pi f RC}\right|^2 2kTR = \frac{2kTR}{1 + 4\pi^2 f^2 R^2 C^2}$$

The mean square value $\overline{v_o^2}$ is given by

$$\overline{v_o^2} = \int_{-\infty}^{\infty} \frac{2kTR}{1 + 4\pi^2 f^2 R^2 C^2}\,df = \frac{kT}{C} \tag{9.41}$$

Hence, the rms thermal noise voltage across the capacitor is $\sqrt{kT/C}$.
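Equation (9.41) says the mean square voltage depends on $C$ and $T$ but not on $R$. A numerical integration of the output PSD illustrates this. The component values below are hypothetical, and the integral is truncated at 1000 corner frequencies, so the estimate falls slightly short of $kT/C$.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def output_psd(f, R, C, T):
    """S_o(f) = 2kTR / (1 + 4 pi^2 f^2 R^2 C^2), from Eq. (9.39) with S_n(f) = 2kTR."""
    return 2.0 * BOLTZMANN * T * R / (1.0 + (2.0 * math.pi * f * R * C) ** 2)

def mean_square_voltage(R, C, T, n_steps=200000, span_in_corner_freqs=1000.0):
    """Midpoint-rule estimate of the integral of S_o(f) over (-F, F)."""
    f_corner = 1.0 / (2.0 * math.pi * R * C)
    F = span_in_corner_freqs * f_corner
    df = F / n_steps
    half = sum(output_psd((i + 0.5) * df, R, C, T) for i in range(n_steps))
    return 2.0 * half * df  # S_o(f) is even in f

C, T = 1.0e-9, 290.0  # hypothetical capacitance (F) and temperature (K)
for R in (1.0e3, 1.0e5):  # two hypothetical resistances (ohms)
    v2 = mean_square_voltage(R, C, T)
    print(R, math.sqrt(v2), math.sqrt(BOLTZMANN * T / C))
```

Changing $R$ moves the corner frequency and the PSD level in opposite directions, so the area under $S_o(f)$ stays the same.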


Sum of Random Processes 

If two stationary processes (at least in the wide sense) $x(t)$ and $y(t)$ are added to form a process $z(t)$, the statistics of $z(t)$ can be determined in terms of those of $x(t)$ and $y(t)$. If

$$z(t) = x(t) + y(t) \tag{9.42a}$$

then

$$R_z(\tau) = \overline{z(t)\,z(t+\tau)} = \overline{[x(t) + y(t)][x(t+\tau) + y(t+\tau)]}$$

$$= R_x(\tau) + R_y(\tau) + R_{xy}(\tau) + R_{yx}(\tau) \tag{9.42b}$$

If $x(t)$ and $y(t)$ are uncorrelated, then from Eq. (9.34),

$$R_{xy}(\tau) = R_{yx}(\tau) = \bar{x}\,\bar{y}$$

and

$$R_z(\tau) = R_x(\tau) + R_y(\tau) + 2\bar{x}\,\bar{y} \tag{9.43}$$

Most processes of interest in communication problems have zero means. If processes $x(t)$ and $y(t)$ are uncorrelated with either $\bar{x}$ or $\bar{y} = 0$ [i.e., if $x(t)$ and $y(t)$ are incoherent], then

$$R_z(\tau) = R_x(\tau) + R_y(\tau) \tag{9.44a}$$

and

$$S_z(f) = S_x(f) + S_y(f) \tag{9.44b}$$

It also follows from Eqs. (9.44a) and (9.19) that

$$\overline{z^2} = \overline{x^2} + \overline{y^2} \tag{9.44c}$$

Hence, the mean square of a sum of incoherent (or orthogonal) processes is equal to the sum of the mean squares of these processes.


Example 9.10 Two independent random voltage processes $x_1(t)$ and $x_2(t)$ are applied to an RC network, as shown in Fig. 9.14. It is given that

$$S_{x_1}(f) = K \qquad S_{x_2}(f) = \frac{2\alpha}{\alpha^2 + (2\pi f)^2}$$

Determine the PSD and the power $P_y$ of the output random process $y(t)$. Assume that the resistors in the circuit contribute negligible thermal noise (i.e., assume that they are noiseless).

Because the network is linear, the output voltage $y(t)$ can be expressed as

$$y(t) = y_1(t) + y_2(t)$$

where $y_1(t)$ is the output from input $x_1(t)$ [assuming $x_2(t) = 0$] and $y_2(t)$ is the output from input $x_2(t)$ [assuming $x_1(t) = 0$]. The transfer functions relating $y(t)$ to $x_1(t)$ and $x_2(t)$ are $H_1(f)$ and $H_2(f)$, respectively, given by

$$H_1(f) = \frac{1}{3(3 \cdot j2\pi f + 1)} \qquad H_2(f) = \frac{1}{2(3 \cdot j2\pi f + 1)}$$

Hence,

$$S_{y_1}(f) = |H_1(f)|^2 S_{x_1}(f) = \frac{K}{9[9(2\pi f)^2 + 1]}$$

and

$$S_{y_2}(f) = |H_2(f)|^2 S_{x_2}(f) = \frac{\alpha}{2[9(2\pi f)^2 + 1][\alpha^2 + (2\pi f)^2]}$$

Because the input processes $x_1(t)$ and $x_2(t)$ are independent, the outputs $y_1(t)$ and $y_2(t)$ generated by them will also be independent. Also, the PSDs of $y_1(t)$ and $y_2(t)$ have no impulses at $f = 0$, implying that they have no dc components [i.e., $\overline{y_1(t)} = \overline{y_2(t)} = 0$]. Hence, $y_1(t)$ and $y_2(t)$ are incoherent, and

$$S_y(f) = S_{y_1}(f) + S_{y_2}(f) = \frac{2K[\alpha^2 + (2\pi f)^2] + 9\alpha}{18[9(2\pi f)^2 + 1][\alpha^2 + (2\pi f)^2]}$$


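The power $P_y$ (or the mean square value $\overline{y^2}$) can be determined in two ways. We can find $R_y(\tau)$ by taking the inverse transforms of $S_{y_1}(f)$ and $S_{y_2}(f)$ as

$$R_y(\tau) = \underbrace{\frac{K}{54}e^{-|\tau|/3}}_{R_{y_1}(\tau)} + \underbrace{\frac{3\alpha\, e^{-|\tau|/3} - e^{-\alpha|\tau|}}{4(9\alpha^2 - 1)}}_{R_{y_2}(\tau)}$$

and

$$P_y = \overline{y^2} = R_y(0) = \frac{K}{54} + \frac{3\alpha - 1}{4(9\alpha^2 - 1)}$$

Alternatively, we can determine $\overline{y^2}$ by integrating $S_y(f)$ with respect to $f$ [see Eq. (9.19)].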
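The two routes to $P_y$ can be compared numerically. The sketch below integrates the output PSD of this example over all frequencies and checks it against $R_y(0)$. The values of $K$ and $\alpha$ are hypothetical, and the substitution $\omega = \tan\theta$ is just one convenient way to handle the infinite range.

```python
import math

def S_y(omega, K, alpha):
    """Output PSD of Example 9.10 written in terms of omega = 2*pi*f."""
    d1 = 9.0 * omega ** 2 + 1.0
    d2 = alpha ** 2 + omega ** 2
    return (2.0 * K * d2 + 9.0 * alpha) / (18.0 * d1 * d2)

def power_by_integration(K, alpha, n_steps=20000):
    """P_y = (1/2pi) * integral of S_y over omega, with omega = tan(theta)."""
    d_theta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        omega = math.tan(theta)
        total += S_y(omega, K, alpha) * (1.0 + omega ** 2)  # d(omega) = sec^2(theta) d(theta)
    return total * d_theta / (2.0 * math.pi)

def power_closed_form(K, alpha):
    """R_y(0) from the inverse transforms; this form requires alpha != 1/3."""
    return K / 54.0 + (3.0 * alpha - 1.0) / (4.0 * (9.0 * alpha ** 2 - 1.0))

K, alpha = 2.0, 0.5  # hypothetical values
print(power_by_integration(K, alpha), power_closed_form(K, alpha))
```

Both routes give the same power, as Eq. (9.19) requires.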


9.6 APPLICATION: OPTIMUM FILTERING 
(WIENER-HOPF FILTER) 


When a desired signal is mixed with noise, the SNR can be improved by passing it through a filter that suppresses frequency components where the signal is weak but the noise is strong. The SNR improvement in this case can be explained qualitatively by considering a case of white noise mixed with a signal $m(t)$ whose PSD decreases at high frequencies. If the filter attenuates higher frequencies more, the signal will be reduced and, in fact, distorted. The distortion component $m_\epsilon(t)$ may be considered as bad as added noise. Thus, attenuation of higher frequencies will cause additional noise (from signal distortion), but, in compensation, it will reduce the channel noise, which is strong at high frequencies. Because at higher frequencies the signal has a small power content, the distortion component will be small in comparison to the reduction in channel noise, and the total distortion may be smaller than before.

Let $H_{\text{opt}}(f)$ be the optimum filter (Fig. 9.15a). This filter, not being ideal, will cause signal distortion. The distortion signal $m_\epsilon(t)$ can be found from Fig. 9.15b. The distortion signal power $N_D$ appearing at the output is given by

$$N_D = \int_{-\infty}^{\infty} S_m(f)\,|H_{\text{opt}}(f) - 1|^2\,df$$

where $S_m(f)$ is the signal PSD at the input of the receiving filter. The channel noise power $N_{\text{ch}}$ appearing at the filter output is given by

$$N_{\text{ch}} = \int_{-\infty}^{\infty} S_n(f)\,|H_{\text{opt}}(f)|^2\,df$$

where $S_n(f)$ is the noise PSD appearing at the input of the receiving filter. The distortion component acts as a noise. Because the signal and the channel noise are incoherent, the total noise $N_o$ at the receiving filter output is the sum of the channel noise $N_{\text{ch}}$ and the distortion noise $N_D$,

$$N_o = N_{\text{ch}} + N_D = \int_{-\infty}^{\infty} \left[|H_{\text{opt}}(f)|^2 S_n(f) + |H_{\text{opt}}(f) - 1|^2 S_m(f)\right] df \tag{9.45a}$$

Figure 9.15 Wiener-Hopf filter calculations. (a) The input $m(t) + n(t)$ applied to $H_{\text{opt}}(f)$ gives the output $m(t) + m_\epsilon(t) + n_{\text{ch}}(t)$. (b) The signal $m(t)$ applied to $H_{\text{opt}}(f) - 1$ gives the distortion $m_\epsilon(t)$.


Using the fact that $|A + B|^2 = (A + B)(A^* + B^*)$, and noting that both $S_m(f)$ and $S_n(f)$ are real, we can rearrange Eq. (9.45a) as

$$N_o = \int_{-\infty}^{\infty} \left[\left|H_{\text{opt}}(f) - \frac{S_m(f)}{S_r(f)}\right|^2 S_r(f) + \frac{S_m(f)\,S_n(f)}{S_r(f)}\right] df \tag{9.45b}$$

where $S_r(f) = S_m(f) + S_n(f)$. The integrand on the right-hand side of Eq. (9.45b) is nonnegative. Moreover, it is a sum of two nonnegative terms. Hence, to minimize $N_o$, we must minimize each term. Because the second term $S_m(f)S_n(f)/S_r(f)$ is independent of $H_{\text{opt}}(f)$, only the first term can be minimized. From Eq. (9.45b) it is obvious that this term is minimum at zero when

$$H_{\text{opt}}(f) = \frac{S_m(f)}{S_r(f)} = \frac{S_m(f)}{S_m(f) + S_n(f)} \tag{9.46a}$$


For this optimum choice, the output noise power $N_o$ is given by

$$N_o = \int_{-\infty}^{\infty} \frac{S_m(f)\,S_n(f)}{S_r(f)}\,df = \int_{-\infty}^{\infty} \frac{S_m(f)\,S_n(f)}{S_m(f) + S_n(f)}\,df \tag{9.46b}$$

The optimum filter is known as the Wiener-Hopf filter in the literature. Equation (9.46a) shows that $H_{\text{opt}}(f) \approx 1$ (no attenuation) when $S_m(f) \gg S_n(f)$. But when $S_m(f) \ll S_n(f)$, the filter has high attenuation. In other words, the optimum filter attenuates heavily the band where noise is relatively stronger. This causes some signal distortion, but at the same time it attenuates the noise more heavily so that the overall SNR is improved.
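Because the integrand of Eq. (9.45a) is minimized frequency by frequency, the optimality of Eq. (9.46a) can be checked at a single frequency. The sketch below uses hypothetical PSD values and function names; it compares the integrand at the Wiener-Hopf gain with the integrand at nearby real and complex gains.

```python
def noise_integrand(H, S_m, S_n):
    """Integrand of Eq. (9.45a) at one frequency: |H|^2 S_n + |H - 1|^2 S_m."""
    return abs(H) ** 2 * S_n + abs(H - 1) ** 2 * S_m

def wiener_hopf_gain(S_m, S_n):
    """Eq. (9.46a) evaluated at one frequency."""
    return S_m / (S_m + S_n)

S_m, S_n = 4.0, 1.0  # hypothetical signal and noise PSD values at some f
H_opt = wiener_hopf_gain(S_m, S_n)
best = noise_integrand(H_opt, S_m, S_n)
print(H_opt, best, S_m * S_n / (S_m + S_n))  # minimum equals the integrand of Eq. (9.46b)

for delta in (0.1, -0.1, 0.05j, -0.05j):
    print(delta, noise_integrand(H_opt + delta, S_m, S_n) - best)  # positive excess
```

Every perturbation raises the integrand by $|\delta|^2 S_r(f)$, in line with the completed-square form of Eq. (9.45b).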

Comments on the Optimum Filter 

If the SNR at the filter input is reasonably large, for example, $S_m(f) > 100\,S_n(f)$ (SNR of 20 dB), the optimum filter [Eq. (9.46a)] is practically an ideal filter, and $N_o$ [Eq. (9.46b)] is given by

$$N_o \simeq \int_{-\infty}^{\infty} S_n(f)\,df$$

Hence for a large input SNR, optimization yields insignificant improvement. The Wiener-Hopf filter is therefore practical only when the input SNR is small (large-noise case).

Another issue is the realizability of the optimum filter in Eq. (9.46a). Because $S_m(f)$ and $S_n(f)$ are both even functions of $f$, the optimum filter $H_{\text{opt}}(f)$ is an even function of $f$. Hence, the unit impulse response $h_{\text{opt}}(t)$ is an even function of $t$ (see Prob. 3.1-1). This makes $h_{\text{opt}}(t)$ noncausal and the filter unrealizable. As noted earlier, such a filter can be realized approximately if we are willing to tolerate some delay in the output. If delay cannot be tolerated, the derivation of $H_{\text{opt}}(f)$ must be repeated with a realizability constraint. Note that the realizable optimum filter can never be superior to the unrealizable optimum filter [Eq. (9.46a)]. Thus, the filter in Eq. (9.46a) gives the upper bound on performance (output SNR). Discussion of realizable optimum filters can be readily found in the literature [1, 2].



Figure 9.16a shows $h_{\text{opt}}(t)$. It is evident that this is an unrealizable filter. However, a delayed version (Fig. 9.16b) of this filter, that is, $h_{\text{opt}}(t - t_0)$, is closely realizable if we make $t_0 \ge 3/\beta$ and eliminate the tail for $t < 0$ (Fig. 9.16c).



9.7 APPLICATION: PERFORMANCE ANALYSIS OF 
BASEBAND ANALOG SYSTEMS 

We now apply the concept of power spectral density (PSD) to analyze the performance of 
baseband analog communication systems. In analog signals, the SNR is basic in specifying the 
signal quality. For voice signals, an SNR of 5 to 10 dB at the receiver implies a barely intelligible 
signal. Telephone-quality signals have an SNR of 25 to 35 dB, whereas for television, an SNR 
of 45 to 55 dB is required. 

Figure 9.17 shows a simple communication system in which analog signal $m(t)$ is transmitted at power $S_T$ through a channel (representing a transmission medium). The transmitted signal is corrupted by additive channel noise during transmission. The channel also attenuates (and may also distort) the signal. At the receiver input, we have a signal mixed with noise. The signal and noise powers at the receiver input are $S_i$ and $N_i$, respectively.

The receiver processes (filters) the signal to yield the output $s_o(t) + n_o(t)$. The noise component $n_o(t)$ came from the processing of $n(t)$ by the receiver, while the signal component $s_o(t)$ came from the message $m(t)$. The signal and noise powers at the receiver output are $S_o$ and $N_o$, respectively. In analog systems, the quality of the received signal is determined by $S_o/N_o$, the output SNR. Hence, we shall focus our attention on this figure of merit under either a fixed transmission power $S_T$ or for a given $S_i$.

Figure 9.17 Communication system model.

In baseband systems, the signal is transmitted directly without any modulation. This mode 
of communication is suitable over a pair of twisted wires or coaxial cables. It is mainly used 
in short-haul links. For a baseband system, the transmitter and the receiver are ideal baseband 
filters. The ideal low-pass transmitter limits the input signal spectrum to a given bandwidth, 
whereas the low-pass receiver eliminates the out-of-band noise and other channel interference. 
(More elaborate transmitter and receiver filters can be used, as shown in the next section.) 

The baseband signal $m(t)$ is assumed to be a zero mean, wide-sense stationary random process band-limited to $B$ Hz. We consider the case of ideal low-pass (or baseband) filters with bandwidth $B$ at the transmitter and the receiver (Fig. 9.17). The channel is assumed to be distortionless. The power, or the mean square value, of $m(t)$ is $\overline{m^2}$, given by

$$S_i = \overline{m^2} = 2\int_0^B S_m(f)\,df \tag{9.49}$$

For this case,

$$S_o = S_i \tag{9.50a}$$

and

$$N_o = 2\int_0^B S_n(f)\,df \tag{9.50b}$$

where $S_n(f)$ is the PSD of the channel noise. For the case of a white noise, $S_n(f) = \mathcal{N}/2$, and

$$N_o = 2\int_0^B \frac{\mathcal{N}}{2}\,df = \mathcal{N}B \tag{9.50c}$$

and

$$\frac{S_o}{N_o} = \frac{S_i}{\mathcal{N}B} \tag{9.50d}$$

We define a parameter $\gamma$ as

$$\gamma = \frac{S_i}{\mathcal{N}B} \tag{9.51}$$

From Eqs. (9.50d) and (9.51) we have

$$\frac{S_o}{N_o} = \gamma \tag{9.52}$$

The parameter $\gamma$ is directly proportional to $S_i$ and, therefore, directly proportional to $S_T$. Hence, a given $S_T$ (or $S_i$) implies a given $\gamma$. Equation (9.52) is precisely the result we are looking for. It gives the receiver output SNR for a given $S_T$ (or $S_i$).

The value of the SNR in Eq. (9.52) often serves as a benchmark against which the output SNRs of other modulation systems are measured in practice.

Figure 9.18 Optimum preemphasis and deemphasis filters in baseband systems.
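As a quick illustration of Eqs. (9.50c) to (9.52), the baseband output SNR follows directly from the received power, the noise PSD level, and the bandwidth. The numbers below are hypothetical and chosen only to show the arithmetic; the helper names are not from the text.

```python
import math

def baseband_output_snr(S_i, noise_level, B):
    """gamma = S_i / (N * B), where the white-noise PSD is S_n(f) = N/2."""
    return S_i / (noise_level * B)

def to_db(ratio):
    """Power ratio expressed in decibels."""
    return 10.0 * math.log10(ratio)

S_i = 1.0e-9           # hypothetical received signal power, W
noise_level = 1.0e-17  # hypothetical N in W/Hz, so S_n(f) = N/2
B = 4000.0             # hypothetical baseband bandwidth, Hz

gamma = baseband_output_snr(S_i, noise_level, B)
print(gamma, to_db(gamma))
```

Doubling $B$ at fixed $S_i$ halves $\gamma$, a loss of about 3 dB, since the ideal low-pass receiver then admits twice the noise power.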


9.8 APPLICATION: OPTIMUM 

PREEMPHASIS-DEEMPHASIS SYSTEMS 

It is possible to increase the output SNR by deliberate distortion of the transmitted signal (preemphasis) and the corresponding compensation (deemphasis) at the receiver. For an intuitive understanding of this process, consider a case of white channel noise and a signal $m(t)$ whose PSD decreases with frequency. In this case, we can boost the high-frequency components of $m(t)$ at the transmitter (preemphasis). Because the signal has relatively less power at high frequencies, this preemphasis will require only a small increase in transmitted power.* At the receiver, the high-frequency components are attenuated (or deemphasized) to undo the preemphasis at the transmitter. This will restore the useful signal to its original form. The channel noise receives an entirely different treatment. Because the noise is added after the transmitter, it does not undergo preemphasis. At the receiver, however, it does undergo deemphasis (i.e., attenuation of high-frequency components). Thus, at the receiver output, the signal power is restored but the noise power is reduced. The output SNR is therefore increased.

In this section, we consider a baseband system. The extension of preemphasis and deemphasis to modulated systems is straightforward. A baseband system with a preemphasis filter $H_p(f)$ at the transmitter and the corresponding complementary deemphasis filter $H_d(f)$ at the receiver is shown in Fig. 9.18. The channel transfer function is $H_c(f)$, and the PSD of the input signal $m(t)$ is $S_m(f)$. We shall determine the optimum preemphasis-deemphasis (PDE) filters $H_p(f)$ and $H_d(f)$ required for distortionless transmission of the signal $m(t)$.

For distortionless transmission,

$$|H_p(f)\,H_c(f)\,H_d(f)| = G \quad \text{(a constant)} \tag{9.53a}$$

and

$$\theta_p(f) + \theta_c(f) + \theta_d(f) = -2\pi f t_d \tag{9.53b}$$

We want to maximize the output SNR, $S_o/N_o$, for a given transmitted power $S_T$.

Referring to Fig. 9.18, we have

$$S_T = \int_{-\infty}^{\infty} S_m(f)\,|H_p(f)|^2\,df \tag{9.54a}$$

Because $H_p(f)H_c(f)H_d(f) = G\exp(-j2\pi f t_d)$, the signal power $S_o$ at the receiver output is

$$S_o = G^2\int_{-\infty}^{\infty} S_m(f)\,df \tag{9.54b}$$

The noise power $N_o$ at the receiver output is

$$N_o = \int_{-\infty}^{\infty} S_n(f)\,|H_d(f)|^2\,df \tag{9.54c}$$

Thus,

$$\frac{S_o}{N_o} = \frac{G^2\int_{-\infty}^{\infty} S_m(f)\,df}{\int_{-\infty}^{\infty} S_n(f)\,|H_d(f)|^2\,df} \tag{9.55}$$

* Actually, the transmitted power is maintained constant by attenuating the preemphasized signal slightly.

We wish to maximize this ratio subject to the condition in Eq. (9.54a) with $S_T$ as a given constant. Applying this power limitation makes the design of $H_p(f)$ a well-posed problem, for otherwise filters with larger gains will always be better. We can include this constraint by multiplying the numerator and the denominator of the right-hand side of Eq. (9.55) by the left-hand side and the right-hand side, respectively, of Eq. (9.54a). This gives

$$\frac{S_o}{N_o} = \frac{G^2 S_T\int_{-\infty}^{\infty} S_m(f)\,df}{\int_{-\infty}^{\infty} S_n(f)\,|H_d(f)|^2\,df\;\int_{-\infty}^{\infty} S_m(f)\,|H_p(f)|^2\,df} \tag{9.56}$$

The numerator of the right-hand side of Eq. (9.56) is fixed and unaffected by the PDE filters. Hence, to maximize $S_o/N_o$, we need only minimize the denominator of the right-hand side of Eq. (9.56). To do this, we use the Cauchy-Schwarz inequality (Appendix B),

$$\int_{-\infty}^{\infty} S_m(f)\,|H_p(f)|^2\,df\;\int_{-\infty}^{\infty} S_n(f)\,|H_d(f)|^2\,df \ge \left|\int_{-\infty}^{\infty} [S_m(f)\,S_n(f)]^{1/2}\,|H_p(f)\,H_d(f)|\,df\right|^2 \tag{9.57}$$

The equality holds if and only if

$$S_m(f)\,|H_p(f)|^2 = K^2 S_n(f)\,|H_d(f)|^2 \tag{9.58}$$

where K is an arbitrary constant. Thus to maximize S 0 fN 0 , Eq. (9.58) must be satisfied. 
Substitution of Eq. (9.53a) into Eq. (9.58) yields 


j Hp ^l-GK^M}M1 
PVJiopt \H c (f)\ 




gZsmsZT) 


\»S)\ 


(9.59a) 


(9.59b) 


The constant K is found by substituting Eq. (9.59a) into the power constraint of Eq. (9.54a) as 


G fZ x> lV^(f)W)/\Nc(f)\}df 


(9.59c) 
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Substitution of this value of K into Eqs, (9.59a) and (9.59b) yields 


\H p (f)\ 2 opi 


_ SrVS n (f)/S m (f) _ 

\»c(f )I OVSnCOW)/ \fic(f)\]df 

G 2 J^S m (f)S n (f)/\H c (f)\\df 
ST\H c <f)\JsJf)jsJf) 


(9.60a) 

(9,60b) 


The output SNR under optimum conditions is given by Eq. (9.56) with its denominator replaced by the right-hand side of Eq. (9.57). Finally, substituting $|H_p(f)H_d(f)| = G/|H_c(f)|$ leads to

$$\left(\frac{S_o}{N_o}\right)_{\text{opt}} = \frac{S_T\int_{-\infty}^{\infty} S_m(f)\,df}{\left(\int_{-\infty}^{\infty}\left[\sqrt{S_m(f)S_n(f)}\,/\,|H_c(f)|\right]df\right)^2} \tag{9.60c}$$


Equations (9.60a) and (9.60b) give the magnitudes of the optimum filters H p (f) and H d (f). 
The phase functions must be chosen to satisfy the condition of distortionless transmission 
[Eq. (9.53b)]. 

Observe that the preemphasis filter in Eq. (9.59a) boosts frequency components where 
the signal is weak and suppresses frequency components where the signal is strong. The 
deemphasis filter in Eq. (9.59b) does exactly the opposite. Thus, the signal is unchanged but 
the noise is reduced. 


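Example 9.12 Consider the case with $\alpha = 1400\pi$ and

$$S_m(f) = \begin{cases} \dfrac{C}{(2\pi f)^2 + \alpha^2} & |f| < 4000 \\[2mm] 0 & |f| > 4000 \end{cases} \tag{9.61a}$$

The channel noise is white with PSD

$$S_n(f) = \frac{\mathcal{N}}{2} \tag{9.61b}$$

The channel is assumed to be ideal [$H_c(f) = 1$ and $G = 1$] over the band of interest (0-4000 Hz).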


Without preemphasis-deemphasis, we have

$$S_o = \int_{-4000}^{4000} S_m(f)\,df = 2\int_{0}^{4000}\frac{C}{(2\pi f)^2 + \alpha^2}\,df = 10^{-4}C \qquad \alpha = 1400\pi$$

Also, because $G = 1$, the transmitted power $S_T = S_o$,

$$S_o = S_T = 10^{-4}C$$

and the noise power without preemphasis-deemphasis is

$$N_o = \mathcal{N}B = 4000\,\mathcal{N}$$

Therefore,

$$\frac{S_o}{N_o} = 2.5\times 10^{-8}\,\frac{C}{\mathcal{N}} \tag{9.62}$$


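The optimum transmitting and receiving filters are given [Eqs. (9.60a) and (9.60b)] by

$$|H_p(f)|^2 = \frac{10^{-4}\sqrt{(2\pi f)^2 + \alpha^2}}{\int_{-4000}^{4000}\left(1/\sqrt{(2\pi f)^2 + \alpha^2}\right)df} \qquad |f| < 4000 \tag{9.63a}$$

$$|H_d(f)|^2 = \frac{10^{4}\int_{-4000}^{4000}\left(1/\sqrt{(2\pi f)^2 + \alpha^2}\right)df}{\sqrt{(2\pi f)^2 + \alpha^2}} = \frac{0.778\times 10^{4}}{\sqrt{(2\pi f)^2 + \alpha^2}} \qquad |f| < 4000 \tag{9.63b}$$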

The output SNR using optimum preemphasis and deemphasis is found from Eq. (9.60c) as

$$\left(\frac{S_o}{N_o}\right)_{\text{opt}} = \frac{(10^{-4}C)^2}{(\mathcal{N}C/2)\left[\int_{-4000}^{4000}\dfrac{df}{\sqrt{4\pi^2f^2 + (1400\pi)^2}}\right]^2} = 3.3\times 10^{-8}\,\frac{C}{\mathcal{N}} \tag{9.64}$$

Comparison of Eq. (9.62) with Eq. (9.64) shows that preemphasis-deemphasis has increased the output SNR by a factor of 1.32.
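The numbers in Example 9.12 can be checked with a short numerical sketch. The code below is only an illustration, not part of the original example: it evaluates the integrals of Eqs. (9.62) and (9.64) with a midpoint rule, expressing both SNRs in units of $C/\mathcal{N}$. The helper names (`integrate`, `snr_without_pde`, `snr_with_pde`) are hypothetical.

```python
import math

ALPHA = 1400 * math.pi   # alpha from Example 9.12
B = 4000.0               # band edge in Hz

def integrate(func, lo, hi, steps=20000):
    """Composite midpoint rule; adequate for these smooth integrands."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

def message_psd(f):
    """S_m(f) of Eq. (9.61a) divided by the constant C."""
    return 1.0 / ((2 * math.pi * f) ** 2 + ALPHA ** 2)

def snr_without_pde():
    """S_o / N_o in units of C/N: integral of S_m divided by N*B, Eq. (9.62)."""
    return integrate(message_psd, -B, B) / B

def snr_with_pde():
    """Eq. (9.60c) with H_c = 1, G = 1 and S_n = N/2, in units of C/N."""
    signal = integrate(message_psd, -B, B)          # S_o / C, equal to S_T / C
    root = integrate(lambda f: math.sqrt(message_psd(f)), -B, B)
    # Numerator: S_T * integral(S_m) = signal**2 (times C**2).
    # Denominator: (C * N / 2) * root**2.
    return signal ** 2 / (0.5 * root ** 2)

plain = snr_without_pde()
optimum = snr_with_pde()
improvement = optimum / plain
print(f"without PDE: {plain:.3e} C/N")
print(f"with PDE:    {optimum:.3e} C/N")
print(f"improvement: {improvement:.2f}")
```

Carrying the integrals to full precision gives an improvement factor of roughly 1.3, slightly above the 1.32 quoted in the text, because the text rounds $S_o$ to $10^{-4}C$ before forming the ratio.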


9.9 BANDPASS RANDOM PROCESSES 

If the PSD of a random process is confined to a certain passband (Fig. 9.19), the process is a bandpass random process. Bandpass random processes can be used effectively to model modulated communication signals and bandpass noises. Just as a bandpass signal can be represented in terms of quadrature components [see Eq. (3.39)], we can express a bandpass random process $x(t)$ in terms of quadrature components as follows:

$$x(t) = x_c(t)\cos\omega_c t + x_s(t)\sin\omega_c t \tag{9.65}$$

In this representation, $x_c(t)$ is known as the in-phase component and $x_s(t)$ is known as the quadrature component of the bandpass random process.


Figure 9.19 PSD of a bandpass random process.

This can be proven by considering the system in Fig. 9.20a, where $H_0(f)$ is an ideal low-pass filter (Fig. 9.20b) with unit impulse response $h_0(t)$. First we show that the system in Fig. 9.20a is an ideal bandpass filter with the transfer function $H(f)$ shown in Fig. 9.20c. This can be conveniently done by computing the response $h(t)$ to the unit impulse input $\delta(t)$. Because the system contains time-varying multipliers, however, we must also test whether it is a time-varying or a time-invariant system. It is therefore appropriate to consider the system response to an input $\delta(t - \alpha)$. This is an impulse at $t = \alpha$. Using the fact that [see Eq. (2.10b)] $f(t)\delta(t - \alpha) = f(\alpha)\delta(t - \alpha)$, we can express the signals at various points as follows:


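Signal at

$$\begin{aligned}
a_1 &: \; 2\cos(\omega_c\alpha + \theta)\,\delta(t - \alpha) \\
a_2 &: \; 2\sin(\omega_c\alpha + \theta)\,\delta(t - \alpha) \\
b_1 &: \; 2\cos(\omega_c\alpha + \theta)\,h_0(t - \alpha) \\
b_2 &: \; 2\sin(\omega_c\alpha + \theta)\,h_0(t - \alpha) \\
c_1 &: \; 2\cos(\omega_c\alpha + \theta)\cos(\omega_c t + \theta)\,h_0(t - \alpha) \\
c_2 &: \; 2\sin(\omega_c\alpha + \theta)\sin(\omega_c t + \theta)\,h_0(t - \alpha) \\
d &: \; 2h_0(t - \alpha)\left[\cos(\omega_c\alpha + \theta)\cos(\omega_c t + \theta) + \sin(\omega_c\alpha + \theta)\sin(\omega_c t + \theta)\right] \\
&\;\; = 2h_0(t - \alpha)\cos[\omega_c(t - \alpha)]
\end{aligned}$$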


Figure 9.20 (a) Equivalent circuit of an ideal bandpass filter. (b) Ideal low-pass filter frequency response. (c) Ideal bandpass filter frequency response.







Thus, the system response to the input $\delta(t - \alpha)$ is $2h_0(t - \alpha)\cos[\omega_c(t - \alpha)]$. Clearly, this means that the underlying system is linear time invariant, with impulse response

$$h(t) = 2h_0(t)\cos\omega_c t$$

and transfer function

$$H(f) = H_0(f + f_c) + H_0(f - f_c)$$

The transfer function $H(f)$ (Fig. 9.20c) represents an ideal bandpass filter.

If we apply the bandpass process $x(t)$ (Fig. 9.19) to the input of this system, the output $y(t)$ at $d$ will remain the same as $x(t)$. Hence, the output PSD will be the same as the input PSD

$$|H(f)|^2 S_x(f) = S_x(f)$$

If the processes at points $b_1$ and $b_2$ (low-pass filter outputs) are denoted by $x_c(t)$ and $x_s(t)$, respectively, then the output $x(t)$ can be written as

$$x(t) = x_c(t)\cos(\omega_c t + \theta) + x_s(t)\sin(\omega_c t + \theta) \tag{9.66}$$

where $x_c(t)$ and $x_s(t)$ are low-pass random processes band-limited to $B$ Hz (because they are the outputs of low-pass filters of bandwidth $B$). Because Eq. (9.66) is valid for any value of $\theta$, by substituting $\theta = 0$, we get the desired representation in Eq. (9.65).

To characterize $x_c(t)$ and $x_s(t)$, consider once again Fig. 9.20a with the input $x(t)$. Let $\theta$ be an RV uniformly distributed over the range $(0, 2\pi)$, that is, for a sample function, $\theta$ is equally likely to take on any value in the range $(0, 2\pi)$. In this case $x(t)$ is represented as in Eq. (9.66). We observe that $x_c(t)$ is obtained by multiplying $x(t)$ by $2\cos(\omega_c t + \theta)$, and then passing the result through a low-pass filter. The PSD of $2x(t)\cos(\omega_c t + \theta)$ is [see Eq. (9.22b)]

$$4\times\frac{1}{4}\left[S_x(f + f_c) + S_x(f - f_c)\right]$$

This PSD is $S_x(f)$ shifted up and down by $f_c$, as shown in Fig. 9.21a. When this is passed through a low-pass filter, the resulting PSD of $x_c(t)$ is as shown in Fig. 9.21b. It is clear that

$$S_{x_c}(f) = \begin{cases} S_x(f + f_c) + S_x(f - f_c) & |f| < B \\ 0 & |f| > B \end{cases} \tag{9.67a}$$


Figure 9.21 Derivation of PSDs of quadrature components of a bandpass random process.

We can obtain $S_{x_s}(f)$ in the same way. As far as the PSD is concerned, multiplication by $\cos(\omega_c t + \theta)$ or $\sin(\omega_c t + \theta)$ makes no difference [see footnote following Eq. (9.22a)], and we get

$$S_{x_c}(f) = S_{x_s}(f) = \begin{cases} S_x(f + f_c) + S_x(f - f_c) & |f| < B \\ 0 & |f| > B \end{cases} \tag{9.67b}$$


From Figs. 9.19 and 9.21b, we make the interesting observation that the areas under the PSDs $S_x(f)$, $S_{x_c}(f)$, and $S_{x_s}(f)$ are equal. Hence, it follows that

$$\overline{x_c^2(t)} = \overline{x_s^2(t)} = \overline{x^2(t)} \tag{9.67c}$$

Thus, the mean square values (or powers) of $x_c(t)$ and $x_s(t)$ are identical to that of $x(t)$.
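The equal-area observation behind Eq. (9.67c) can be illustrated numerically. The sketch below assumes a hypothetical triangular bandpass PSD (center frequency 100 Hz, bandwidth $2B = 20$ Hz); this shape is chosen only for illustration and does not come from the text. It builds the quadrature-component PSD from Eq. (9.67a) and compares the areas under the two PSDs.

```python
def bandpass_psd(f, fc=100.0, half_bw=10.0):
    """Hypothetical triangular bandpass PSD centered at +/- fc (illustration only)."""
    offset = abs(abs(f) - fc)
    return max(0.0, 1.0 - offset / half_bw)

def quadrature_psd(f, fc=100.0, half_bw=10.0):
    """Eq. (9.67a): S_x(f + fc) + S_x(f - fc) for |f| < B, zero outside."""
    if abs(f) >= half_bw:
        return 0.0
    return bandpass_psd(f + fc, fc, half_bw) + bandpass_psd(f - fc, fc, half_bw)

def area(func, lo, hi, steps=40000):
    """Midpoint-rule estimate of the area under func between lo and hi."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

power_x = area(bandpass_psd, -120.0, 120.0)     # mean square value of x(t)
power_xc = area(quadrature_psd, -20.0, 20.0)    # mean square value of x_c(t) or x_s(t)
print(power_x, power_xc)
```

Both areas come out the same (20 for this particular PSD), as Eq. (9.67c) requires: shifting the two spectral lobes to baseband and adding them rearranges the area without changing it.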

These results are derived by assuming $\theta$ to be an RV. For the representation in Eq. (9.65), $\theta = 0$, and Eqs. (9.67b) and (9.67c) may not be true. Fortunately, those equations hold even for the case of $\theta = 0$. The proof is rather long and cumbersome and will not be given here [1-3]. It can also be shown [1-3] that

$$\overline{x_c(t)\,x_s(t)} = R_{x_c x_s}(0) = 0 \tag{9.68}$$

That is, the amplitudes $x_c$ and $x_s$ at any given instant are uncorrelated. Moreover, if $S_x(f)$ is symmetrical about $\omega_c$ (as well as $-\omega_c$), then

$$R_{x_c x_s}(\tau) = 0 \tag{9.69}$$


Example 9.13 The PSD of a bandpass white noise $n(t)$ is $\mathcal{N}/2$ (Fig. 9.22a). Represent this process in terms of quadrature components. Derive $S_{n_c}(f)$ and $S_{n_s}(f)$, and verify that $\overline{n_c^2} = \overline{n_s^2} = \overline{n^2}$.


Figure 9.22 (a) PSD of a bandpass white noise process. (b) PSD of its quadrature components.

We have the expression

$$n(t) = n_c(t)\cos\omega_c t + n_s(t)\sin\omega_c t$$

where

$$S_{n_c}(f) = S_{n_s}(f) = \begin{cases} S_n(f + f_c) + S_n(f - f_c) & |f| < B \\ 0 & |f| > B \end{cases} \tag{9.70}$$

It follows from this equation and from Fig. 9.22 that

$$S_{n_c}(f) = S_{n_s}(f) = \begin{cases} \mathcal{N} & |f| < B \\ 0 & |f| > B \end{cases} \tag{9.71}$$

Also,

$$\overline{n^2} = 2\int_{f_c - B}^{f_c + B}\frac{\mathcal{N}}{2}\,df = 2\mathcal{N}B \tag{9.72a}$$

From Fig. 9.22b it follows that

$$\overline{n_c^2} = \overline{n_s^2} = 2\int_{0}^{B}\mathcal{N}\,df = 2\mathcal{N}B \tag{9.72b}$$

Hence,

$$\overline{n_c^2} = \overline{n_s^2} = \overline{n^2} = 2\mathcal{N}B \tag{9.72c}$$


Nonuniqueness of the Quadrature Representation 

No unique center frequency exists for a bandpass signal. For the spectrum in Fig. 9.23a, for example, we may consider the spectrum to have a bandwidth $2B$ centered at $\omega_c$. The same spectrum can be considered to have a bandwidth $2B'$ centered at $\omega_1$, as also shown in Fig. 9.23a. The quadrature representation [Eq. (9.65)] is also possible for center frequency $\omega_1$:

$$x(t) = x_{c_1}(t)\cos\omega_1 t + x_{s_1}(t)\sin\omega_1 t$$

where

$$S_{x_{c_1}}(f) = S_{x_{s_1}}(f) = \begin{cases} S_x(f + f_1) + S_x(f - f_1) & |f| < B' \\ 0 & |f| > B' \end{cases} \tag{9.73}$$


Figure 9.23 Nonunique nature of quadrature component representation of a bandpass process.

This is shown in Fig. 9.23b. Thus, the quadrature representation of a bandpass process is not unique. An infinite number of possible choices exist for the center frequency, and corresponding to each center frequency is a distinct quadrature representation.


Example 9.14 A bandpass white noise PSD of an SSB channel (lower sideband) is shown in Fig. 9.24a. Represent this signal in terms of quadrature components with the carrier frequency $\omega_c$.

The true center frequency of this PSD is not $\omega_c$, but we can still use $\omega_c$ as the center frequency, as discussed earlier,

$$n(t) = n_c(t)\cos\omega_c t + n_s(t)\sin\omega_c t \tag{9.74}$$

The PSD $S_{n_c}(f)$ or $S_{n_s}(f)$ obtained by shifting $S_n(f)$ up and down by $f_c$ [see Eq. (9.73)] is shown in Fig. 9.24b.

Figure 9.24 A possible form of quadrature component representation of noise in SSB.

Bandpass “White” Gaussian Random Process

Thus far we have avoided defining a Gaussian random process. The Gaussian random process is perhaps the single most important random process in the area of communication. A careful and unhurried discussion, however, is beyond our scope. All we need to know here is that an RV $x(t)$ formed by sample function amplitudes at instant $t$ of a Gaussian process is Gaussian, with a PDF of the form of Eq. (8.39).

A Gaussian random process with a uniform PSD is called a white Gaussian random process. The term bandpass “white” Gaussian process is actually a misnomer. However, it is a popular notion to represent a random process $n(t)$ with uniform PSD $\mathcal{N}/2$ centered at $\omega_c$ and with a bandwidth $2B$ (Fig. 9.22a). Utilizing the quadrature representation, it can be expressed as

$$n(t) = n_c(t)\cos\omega_c t + n_s(t)\sin\omega_c t \tag{9.77}$$


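where, from Eq. (9.71), we have

$$S_{n_c}(f) = S_{n_s}(f) = \begin{cases} \mathcal{N} & |f| < B \\ 0 & |f| > B \end{cases}$$

Also, from Eq. (9.72c),

$$\overline{n_c^2} = \overline{n_s^2} = \overline{n^2} = 2\mathcal{N}B \tag{9.78}$$

The bandpass signal can also be expressed in polar form [see Eq. (3.40)]:

$$n(t) = E(t)\cos(\omega_c t + \Theta) \tag{9.79a}$$

where the random envelope and random phase are defined by

$$E(t) = \sqrt{n_c^2(t) + n_s^2(t)} \tag{9.79b}$$

$$\Theta(t) = -\tan^{-1}\frac{n_s(t)}{n_c(t)} \tag{9.79c}$$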

The RVs $n_c(t)$ and $n_s(t)$ are uncorrelated [see Eq. (9.68)] Gaussian RVs with zero means and variance $2\mathcal{N}B$ [Eq. (9.78)]. Hence, their PDFs are identical:

$$p_{n_c}(\alpha) = p_{n_s}(\alpha) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\alpha^2/2\sigma^2} \tag{9.80a}$$

where

$$\sigma^2 = 2\mathcal{N}B \tag{9.80b}$$

It has been shown in Prob. 8.2-10 that if two Gaussian RVs are uncorrelated, they are independent. In such a case, as shown in Example 8.17, $E(t)$ has a Rayleigh density

$$p_E(E) = \frac{E}{\sigma^2}\,e^{-E^2/2\sigma^2}\,u(E), \qquad \sigma^2 = 2\mathcal{N}B \tag{9.81}$$

and $\Theta$ in Eq. (9.79a) is uniformly distributed over $(0, 2\pi)$.

Figure 9.25 Phasor representation of a sinusoid and a narrowband Gaussian noise.
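The Rayleigh envelope and uniform phase of Eqs. (9.79) to (9.81) can be illustrated with a small Monte Carlo sketch. The code is only an illustration under stated assumptions: the value $\sigma = 2$, the sample count, and the function name `simulate_envelope` are arbitrary choices, and the comparison values $\sigma\sqrt{\pi/2}$ and $2\sigma^2$ are the standard mean and mean square of a Rayleigh RV rather than results quoted in the text.

```python
import math
import random

def simulate_envelope(sigma, count, seed=1):
    """Draw independent Gaussian n_c, n_s and return envelope and phase samples."""
    rng = random.Random(seed)
    envelopes, phases = [], []
    for _ in range(count):
        n_c = rng.gauss(0.0, sigma)
        n_s = rng.gauss(0.0, sigma)
        envelopes.append(math.hypot(n_c, n_s))     # Eq. (9.79b)
        phases.append(-math.atan2(n_s, n_c))       # Eq. (9.79c), four-quadrant form
    return envelopes, phases

sigma = 2.0
envelopes, phases = simulate_envelope(sigma, 50000)

mean_envelope = sum(envelopes) / len(envelopes)
mean_square = sum(e * e for e in envelopes) / len(envelopes)
mean_phase = sum(phases) / len(phases)
rayleigh_mean = sigma * math.sqrt(math.pi / 2)     # mean of a Rayleigh RV

print(mean_envelope, rayleigh_mean)
print(mean_square, 2 * sigma ** 2)
print(mean_phase)
```

Note that `math.atan2` returns the phase on $(-\pi, \pi]$ rather than $(0, 2\pi)$; the two intervals describe the same uniform distribution over one full revolution, so the sample mean of the phase is close to zero here.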



Sinusoidal Signal in Noise 

Another case of interest is a sinusoid plus a narrowband Gaussian noise. If $A\cos(\omega_c t + \varphi)$ is a sinusoid mixed with $n(t)$, a Gaussian bandpass noise centered at $\omega_c$, then the sum $y(t)$ is given by

$$y(t) = A\cos(\omega_c t + \varphi) + n(t)$$

Using Eq. (9.66) to represent the bandpass noise, we have

$$y(t) = [A + n_c(t)]\cos(\omega_c t + \varphi) + n_s(t)\sin(\omega_c t + \varphi) \tag{9.82a}$$

$$\phantom{y(t)} = E(t)\cos[\omega_c t + \Theta(t) + \varphi] \tag{9.82b}$$

where $E(t)$ is the envelope [$E(t) > 0$] and $\Theta(t)$ is the angle shown in Fig. 9.25,

$$E(t) = \sqrt{[A + n_c(t)]^2 + n_s^2(t)} \tag{9.83a}$$

$$\Theta(t) = -\tan^{-1}\frac{n_s(t)}{A + n_c(t)} \tag{9.83b}$$

Both $n_c(t)$ and $n_s(t)$ are Gaussian, with variance $\sigma^2$. For white Gaussian noise, $\sigma^2 = 2\mathcal{N}B$ [Eq. (9.80b)]. Arguing in a manner analogous to that used in deriving Eq. (8.57), and observing that

$$\begin{aligned}
n_c^2 + n_s^2 &= E^2 - A^2 - 2An_c \\
&= E^2 - 2A(A + n_c) + A^2 \\
&= E^2 - 2AE\cos\Theta(t) + A^2
\end{aligned}$$

we have

$$p_{E\Theta}(E, \theta) = \frac{E}{2\pi\sigma^2}\,e^{-(E^2 - 2AE\cos\theta + A^2)/2\sigma^2} \tag{9.84}$$

where $\sigma^2$ is the variance of $n_c$ (or $n_s$) and is equal to $2\mathcal{N}B$ for white noise. From Eq. (9.84) we have

$$p_E(E) = \int_{-\pi}^{\pi} p_{E\Theta}(E, \theta)\,d\theta = \frac{E}{\sigma^2}\,e^{-(E^2 + A^2)/2\sigma^2}\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{(AE/\sigma^2)\cos\theta}\,d\theta\right] \tag{9.85}$$

The bracketed term on the right-hand side of Eq. (9.85) defines $I_0(AE/\sigma^2)$, where $I_0$ is the modified zero-order Bessel function of the first kind. Thus,

$$p_E(E) = \frac{E}{\sigma^2}\,e^{-(E^2 + A^2)/2\sigma^2}\,I_0\!\left(\frac{AE}{\sigma^2}\right) \tag{9.86a}$$

Figure 9.26 Ricean PDF.

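This is known as the Rice density, or Ricean density. For a large sinusoidal signal ($A \gg \sigma$), it can be shown that [4]

$$I_0\!\left(\frac{AE}{\sigma^2}\right) \simeq \sqrt{\frac{\sigma^2}{2\pi AE}}\;e^{AE/\sigma^2}$$

and

$$p_E(E) \simeq \sqrt{\frac{E}{2\pi A\sigma^2}}\;e^{-(E - A)^2/2\sigma^2} \tag{9.86b}$$

Because $A \gg \sigma$, $E \simeq A$, and $p_E(E)$ in Eq. (9.86b) is very nearly a Gaussian density with mean $A$ and variance $\sigma^2$,

$$p_E(E) \simeq \frac{1}{\sigma\sqrt{2\pi}}\;e^{-(E - A)^2/2\sigma^2} \tag{9.86c}$$

Figure 9.26 shows the PDF of the normalized RV $E/\sigma$. Note that for $A/\sigma = 0$, we obtain the Rayleigh density.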
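The behavior of the Rice density in Eq. (9.86a) can be explored with a short sketch. The code below is a minimal illustration, not a library-grade implementation: `bessel_i0` sums the standard power series $I_0(x) = \sum_k (x/2)^{2k}/(k!)^2$, which is adequate for the moderate arguments used here, and the parameter values are arbitrary. It checks that the density integrates to one, that it reduces to the Rayleigh density of Eq. (9.81) when $A = 0$, and that it approaches the Gaussian of Eq. (9.86c) when $A \gg \sigma$.

```python
import math

def bessel_i0(x):
    """Modified Bessel function I0 from its power series (moderate x only)."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-17 * total:
        total += term
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
    return total

def rice_pdf(e, a, sigma):
    """Eq. (9.86a); the density is zero for negative envelope values."""
    if e < 0:
        return 0.0
    s2 = sigma ** 2
    return (e / s2) * math.exp(-(e * e + a * a) / (2 * s2)) * bessel_i0(a * e / s2)

def rayleigh_pdf(e, sigma):
    """Eq. (9.81)."""
    if e < 0:
        return 0.0
    s2 = sigma ** 2
    return (e / s2) * math.exp(-e * e / (2 * s2))

def gaussian_pdf(e, mean, sigma):
    """Eq. (9.86c)."""
    return math.exp(-(e - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Area under the Rice density for A/sigma = 2 (midpoint rule on 0..12).
steps = 12000
width = 12.0 / steps
total_area = width * sum(rice_pdf((i + 0.5) * width, 2.0, 1.0) for i in range(steps))

print(total_area)
print(rice_pdf(1.5, 0.0, 1.0), rayleigh_pdf(1.5, 1.0))
print(rice_pdf(10.0, 10.0, 1.0), gaussian_pdf(10.0, 10.0, 1.0))
```

For $A/\sigma = 10$ the Rice and Gaussian values at $E = A$ agree to about three decimal places, which is the sense in which Eq. (9.86c) is a good approximation for a strong sinusoid.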

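From the joint PDF $p_{E\Theta}(E, \theta)$, we can also obtain $p_\Theta(\theta)$, the PDF of the phase $\Theta$, by integrating the joint PDF with respect to $E$,

$$p_\Theta(\theta) = \int_{0}^{\infty} p_{E\Theta}(E, \theta)\,dE$$

Although the integration is straightforward, there are a number of involved steps, and for this reason it will not be repeated here. The final result is

$$p_\Theta(\theta) = \frac{1}{2\pi}\,e^{-A^2/2\sigma^2}\left\{1 + \frac{A}{\sigma}\sqrt{2\pi}\,\cos\theta\;e^{A^2\cos^2\theta/2\sigma^2}\left[1 - Q\!\left(\frac{A\cos\theta}{\sigma}\right)\right]\right\} \tag{9.86d}$$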
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PROBLEMS 


9.1-1 (a) Sketch the ensemble of the random process

$$x(t) = a\cos(\omega_c t + \theta)$$

where $\omega_c$ and $\theta$ are constants and $a$ is an RV uniformly distributed in the range $(-A, A)$.

(b) Just by observing the ensemble, determine whether this is a stationary or a nonstationary process. Give your reasons.

9.1-2 Repeat part (a) of Prob. 9.1-1 if $a$ and $\theta$ are constants but $\omega_c$ is an RV uniformly distributed in the range (0, 100).

9.1-3 (a) Sketch the ensemble of the random process

$$x(t) = at + b$$

where $b$ is a constant and $a$ is an RV uniformly distributed in the range $(-2, 2)$.

(b) Just by observing the ensemble, state whether this is a stationary or a nonstationary process.

9.1-4 Determine $\overline{x(t)}$ and $R_x(t_1, t_2)$ for the random process in Prob. 9.1-1, and determine whether this is a wide-sense stationary process.

9.1-5 Repeat Prob. 9.1-4 for the process $x(t)$ in Prob. 9.1-2.

9.1-6 Repeat Prob. 9.1-4 for the process $x(t)$ in Prob. 9.1-3.

9.1-7 Given a random process $x(t) = kt$, where $k$ is an RV uniformly distributed in the range $(-1, 1)$.

(a) Sketch the ensemble of this process.

(b) Determine $\overline{x(t)}$.

(c) Determine $R_x(t_1, t_2)$.

(d) Is the process wide-sense stationary?

(e) Is the process ergodic?

(f) If the process is wide-sense stationary, what is its power $P_x$ [that is, its mean square value $\overline{x^2(t)}$]?

9.1-8 Repeat Prob. 9.1-7 for the random process

$$x(t) = a\cos(\omega_c t + \theta)$$

where $\omega_c$ is a constant and $a$ and $\theta$ are independent RVs uniformly distributed in the ranges $(-1, 1)$ and $(0, 2\pi)$, respectively.

9.2-1 For each of the following functions, state whether it can be a valid PSD of a real random process.


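(a) $\dfrac{(2\pi f)^2}{(2\pi f)^2 + 16}$

(b) $\dfrac{1}{(2\pi f)^2 - 16}$

(c) $\dfrac{2\pi f}{(2\pi f)^2 + 16}$

(d) $\delta(2\pi f) + \dfrac{1}{(2\pi f)^2 + 16}$

(e) $\delta[2\pi(f + f_0)] - \delta[2\pi(f - f_0)]$

(f) $j[\delta(f + f_0) + \delta(f - f_0)]$

(g) $\dfrac{j(2\pi f)^2}{(2\pi f)^2 + 16}$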




9.2-2 Show that for a wide-sense stationary process $x(t)$,

(a) $R_x(0) \ge |R_x(\tau)|$

Hint: $\overline{(x_1 \pm x_2)^2} = \overline{x_1^2} + \overline{x_2^2} \pm 2\,\overline{x_1 x_2} \ge 0$. Let $x_1 = x(t_1)$ and $x_2 = x(t_2)$.

(b) $\lim_{\tau\to\infty} R_x(\tau) = \bar{x}^2$

Hint: As $\tau \to \infty$, $x_1$ and $x_2$ tend to become independent.

9.2-3 Show that if the PSD of a random process $x(t)$ is band-limited, and if

$$R_x\!\left(\frac{n}{2B}\right) = \begin{cases} 1 & n = 0 \\ 0 & n = \pm 1, \pm 2, \pm 3, \ldots \end{cases}$$

then the minimum bandwidth process $x(t)$ that can exhibit this autocorrelation function is a white band-limited process; that is, $S_x(f) = k\,\Pi(f/2B)$.

Hint: Use the sampling theorem to reconstruct $R_x(\tau)$.

9.2-4 For the random binary process in Example 9.5 (Fig. 9.9a), determine $R_x(\tau)$ and $S_x(f)$ if the probability of transition (from 1 to -1 or vice versa) at each node is $p$ instead of 0.5.

9.2-5 A wide-sense stationary white process $m(t)$ band-limited to $B$ Hz is sampled at the Nyquist rate. Each sample is transmitted by a basic pulse $p(t)$ multiplied by the sample value. This is a PAM signal. Show that the PSD of the PAM signal is $2B R_m(0)|P(f)|^2$.

Hint: Use Eq. (9.31). Show that Nyquist samples $a_k$ and $a_{k+n}$ ($n \ge 1$) are uncorrelated.

9.2-6 A duobinary line code proposed by Lender is a ternary scheme similar to bipolar that requires only half the bandwidth of the latter. In this code, 0 is transmitted by no pulse, and 1 is transmitted by pulse $p(t)$ or $-p(t)$ using the following rule: A 1 is encoded by the same pulse as that used to encode the preceding 1 if the two 1s are separated by an even number of 0s. It is encoded by the negative of the pulse used to encode the preceding 1 if the two 1s are separated by an odd number of 0s. Random binary digits are transmitted every $T_b$ seconds. Assuming $P(0) = P(1) = 0.5$, show that

$$S_y(f) = \frac{|P(f)|^2}{T_b}\cos^2(\pi f T_b)$$

Find $S_y(f)$ if $p(t)$, the basic pulse used, is a half-width rectangular pulse $\Pi(2t/T_b)$.

9.2-7 Determine $S_y(f)$ for polar signaling if $P(1) = Q$ and $P(0) = 1 - Q$.

9.2-8 An impulse noise $x(t)$ can be modeled by a sequence of unit impulses located at random instants (Fig. P9.2-8). There are an average of $\alpha$ impulses per second, and the location of any impulse is independent of the locations of other impulses. Show that $R_x(\tau) = \alpha\,\delta(\tau) + \alpha^2$.

Figure P.9.2-8

9.2-9 Repeat Prob. 9.2-8 if the impulses are equally likely to be positive and negative.

9.2-10 A sample function of a random process $x(t)$ is shown in Fig. P9.2-10. The signal $x(t)$ changes abruptly in amplitude at random instants. There are an average of $\beta$ amplitude changes (or shifts) per second. The probability that there will be no amplitude shift in $\tau$ seconds is given by $P_0(\tau) = e^{-\beta\tau}$. The amplitude after a shift is independent of the amplitude before the shift. The amplitudes are randomly distributed, with a PDF $p_x(x)$. Show that

$$R_x(\tau) = \overline{x^2}\,e^{-\beta|\tau|} \qquad\text{and}\qquad S_x(f) = \frac{2\beta\,\overline{x^2}}{\beta^2 + (2\pi f)^2}$$

This process represents a model for thermal noise [1].

Figure P.9.2-10



9.3-1 Show that for jointly wide-sense stationary, real, random processes $x(t)$ and $y(t)$,

$$|R_{xy}(\tau)| \le [R_x(0)R_y(0)]^{1/2}$$

Hint: For any real number $a$, $\overline{(ax - y)^2} \ge 0$.

9.3-2 If $x(t)$ and $y(t)$ are two incoherent random processes, and two new processes $u(t)$ and $v(t)$ are formed as follows:

$$u(t) = 2x(t) - y(t) \qquad v(t) = x(t) + 3y(t)$$

find $R_u(\tau)$, $R_v(\tau)$, $R_{uv}(\tau)$, and $R_{vu}(\tau)$ in terms of $R_x(\tau)$ and $R_y(\tau)$.

9.3-3 Two random processes $x(t)$ and $y(t)$ are

$$x(t) = A\cos(\omega_0 t + \varphi) \qquad\text{and}\qquad y(t) = B\sin(n\omega_0 t + n\varphi + \psi)$$

where $n$ = integer $\ne 0$ and $A$, $B$, $\psi$, and $\omega_0$ are constants and $\varphi$ is an RV uniformly distributed in the range $(0, 2\pi)$. Show that the two processes are incoherent.

9.3-4 A sample signal is a periodic random process $x(t)$ shown in Fig. P9.3-4. The initial delay $b$ where the first pulse begins is an RV uniformly distributed in the range $(0, T_b)$.

(a) Show that the sample signal can be written as

$$x(t) = C_0 + \sum_{n=1}^{\infty} C_n\cos[n\omega_0(t - b) + \theta_n]$$

by first finding its trigonometric Fourier series when $b = 0$.

(b) Show that

$$R_x(\tau) = C_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} C_n^2\cos n\omega_0\tau \qquad \omega_0 = \frac{2\pi}{T_b}$$



9.4-1 A simple RC circuit has two resistors $R_1$ and $R_2$ in parallel (Fig. P9.4-1a). Calculate the rms value of the thermal noise voltage $v_o$ across the capacitor in two ways:

(a) Consider resistors $R_1$ and $R_2$ as two separate resistors, with respective thermal noise voltages of PSD $2kTR_1$ and $2kTR_2$ (Fig. P9.4-1b). Note that the two sources are independent.

(b) Consider the parallel combination of $R_1$ and $R_2$ as a single resistor of value $R_1R_2/(R_1 + R_2)$, with its thermal-noise voltage source of PSD $2kTR_1R_2/(R_1 + R_2)$ (Fig. P9.4-1c). Comment.

Figure P.9.4-1

9.4-2 Show that $R_{xy}(\tau)$, the cross-correlation function of the input process $x(t)$ and the output process $y(t)$ in Fig. 9.12, is

$$R_{xy}(\tau) = h(\tau) * R_x(\tau) \qquad\text{and}\qquad S_{xy}(f) = H(f)S_x(f)$$

Hence, show that for the thermal noise $n(t)$ and the output $v_o(t)$ in Fig. 9.13 (Example 9.9),

$$S_{nv_o}(f) = \frac{2kTR}{1 + j2\pi fRC} \qquad\text{and}\qquad R_{nv_o}(\tau) = \frac{2kT}{C}\,e^{-\tau/RC}\,u(\tau)$$

9.4-3 A shot noise is similar to impulse noise described in Prob. 9.2-8 except that instead of random impulses, we have pulses of finite width. If we replace each impulse in Fig. P9.2-8 by a pulse $h(t)$ whose width is large in comparison to $1/\alpha$, so that there is a considerable overlapping of pulses, we get shot noise. The result of pulse overlapping is that the signal looks like a continuous random signal, as shown in Fig. P9.4-3.

(a) Derive the autocorrelation function and the PSD of such a random process.

Hint: Shot noise results from passing impulse noise through a suitable filter. First derive the PSD of the shot noise and then obtain the autocorrelation function from the PSD. The answers will be in terms of $\alpha$, $h(t)$, or $H(f)$.

Figure P.9.4-3

(b) The shot noise in transistors can be modeled by

$$h(t) = \frac{q}{T}\,e^{-t/T}\,u(t)$$

where $q$ is the charge on an electron and $T$ is the electron transit time. Determine and sketch the autocorrelation function and the PSD of the transistor shot noise.



9.6-1 A signal process $m(t)$ is mixed with a channel noise $n(t)$. The respective PSDs are

$$S_m(f) = \frac{6}{9 + (2\pi f)^2} \qquad\text{and}\qquad S_n(f) = 6$$

(a) Find the optimum Wiener-Hopf filter.

(b) Sketch its unit impulse response.

(c) Estimate the amount of delay necessary to make this filter closely realizable (causal).

(d) Compute the noise power at the input and the output of the filter.

9.6-2 Repeat Prob. 9.6-1 if

$$S_m(f) = \frac{4}{4 + (2\pi f)^2} \qquad\text{and}\qquad S_n(f) = \frac{32}{64 + (2\pi f)^2}$$

9.7-1 A message signal $m(t)$ with


Smif) = 


(2jt/) 2 + a 2 


(a = 3000jt) 


DSB-SC modulates a carrier of 100 kHz. Assume an ideal channel with [$H_c(f) = 10^{-3}$] and the channel noise PSD $S_n(f) = 2\times 10^{-9}$. The transmitted power is required to be 1 kW, and $G = 10^{-2}$.

(a) Determine transfer functions of optimum preemphasis and deemphasis filters.

(b) Determine the output signal power, the noise power, and the output SNR.

(c) Determine $\gamma$ at the demodulator input.

9.7-2 Repeat Prob. 9.7-1 for the SSB (USB) case.

9.7-3 It was shown in the text that when the baseband $m(t)$ is band-limited with a uniform PSD, PM and FM have identical performance from the SNR point of view. For such $m(t)$, show that optimum PDE filters in angle modulation can improve the output SNR by a factor of 4/3 (or 1.3 dB) only. Find the optimum PDE filter transfer functions.
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Figure P.9.8-1 


Figure P.9.8-3 


9.8-1 A white process of PSD $\mathcal{N}/2$ is transmitted through a bandpass filter $H(f)$ (Fig. P9.8-1). Represent the filter output $n(t)$ in terms of quadrature components, and determine $S_{n_c}(f)$, $S_{n_s}(f)$, $\overline{n_c^2}$, $\overline{n_s^2}$, and $\overline{n^2}$ when the center frequency used in this representation is 100 kHz (i.e., $f_c = 100\times 10^3$).

9.8-2 Repeat Prob. 9.8-1 if the center frequency $f_c$ used in the representation is not a true center frequency. Consider three cases: (a) $f_c = 105$ kHz; (b) $f_c = 95$ kHz; (c) $f_c = 120$ kHz.

9.8-3 A random process $x(t)$ with the PSD shown in Fig. P9.8-3a is passed through a bandpass filter (Fig. P9.8-3b). Determine the PSDs and mean square values of the quadrature components of the output process. Assume the center frequency in the representation to be 0.5 MHz.


















10 PERFORMANCE ANALYSIS OF DIGITAL COMMUNICATION SYSTEMS


In analog communications, the user objective is to achieve high fidelity in waveform reproduction. Hence, the suitable performance criterion is the output signal-to-noise ratio. The choice of this criterion indicates that the signal-to-noise ratio reflects the quality of the message and is related to the ability of a listener to interpret a message.

In digital communication systems, the transmitter input is chosen from a finite set of 
possible symbols. The objective at the receiver is not to reproduce the waveform that carries 
the symbol with fidelity; instead, the receiver aims to accurately determine which particular 
symbol was transmitted among the set of possible ones. Because each symbol is represented by 
a particular waveform at the transmitter, our goal is to decide, from the noisy received signal, 
which particular waveform was originally transmitted. Logically, the appropriate figure of 
merit in a digital communication system is the probability of error in this decision at the 
receiver. In particular, the probability of bit error, also known as the bit error rate (BER), is a 
direct quality measure of the communication system. Not only is the BER important to digital 
signal sources, it is also directly related to the quality of signal reproduction for analog signal 
sources.

In this chapter, we present two important aspects in the performance analysis of digital 
communication systems. The first part focuses on the error analysis of several specific binary 
detection receivers. The goal is for students to learn how to apply the fundamental tools of 
probability theory and random processes for BER performance analysis. Our second focus is to 
illustrate detailed derivation of optimum detection receivers for general digital communication 
systems such that the receiver BER can be minimized. 


10.1 OPTIMUM LINEAR DETECTOR FOR BINARY 
POLAR SIGNALING 

In binary communication systems, the information is transmitted as 0 or 1 in each time interval $T_o$. To begin, we consider the binary polar signaling system of Fig. 10.1a, in which the source signal bits 1 and 0 are represented by $\pm p(t)$, respectively. Having passed a distortionless, but noisy, channel, the received signal waveform is

$$y(t) = \pm p(t) + n(t) \qquad 0 \le t \le T_o \tag{10.1}$$

where $n(t)$ is a Gaussian channel noise.

Figure 10.1 Typical binary polar signaling and linear receiver.


10.1.1 Binary Threshold Detection 

Given the received waveform of Eq. (10.1), the binary receiver must decide whether the transmission was originally a 1 or a 0. Thus, the received signal $y(t)$ must be processed to produce a decision variable for each symbol. The linear receiver for binary signaling, as shown in Fig. 10.1a, has a general architecture that can be optimum (to be shown later in Section 10.6). Given the receiver filter $H(f)$ or $h(t)$, its output signal for $0 \le t \le T_o$ is simply

$$r(t) = \pm\underbrace{p(t) * h(t)}_{p_o(t)} + \underbrace{n(t) * h(t)}_{n_o(t)} = \pm p_o(t) + n_o(t) \tag{10.2}$$

The decision variable of this linear binary receiver is the sample of the receiver filter output at $t = t_m$:

$$r(t_m) = \pm p_o(t_m) + n_o(t_m) \tag{10.3}$$

Based on the properties of Gaussian variables in Section 8.6,

$$n_o(t) = \int_{0}^{t} n(\tau)\,h(t - \tau)\,d\tau$$

is Gaussian with zero mean so long as $n(t)$ is a zero mean Gaussian noise. If we define

$$A_p = p_o(t_m) \tag{10.4a}$$

$$\sigma_n^2 = E\{n_o(t_m)^2\} \tag{10.4b}$$

then this binary detection problem is exactly the same as the threshold detection of Example 8.16. We have shown in Example 8.16 that, if the binary data are equally likely to be 0 or 1, then the optimum threshold detection is

$$\text{dec}\{r(t_m)\} = \begin{cases} 1 & \text{if } r(t_m) > 0 \\ 0 & \text{if } r(t_m) < 0 \end{cases} \tag{10.5a}$$

whereas the probability of (bit) error is

$$P_e = Q(\rho) \tag{10.5b}$$

in which

$$\rho = \frac{A_p}{\sigma_n} \tag{10.5c}$$

To minimize $P_e$, we need to maximize $\rho$ because $Q(\rho)$ decreases monotonically with $\rho$.
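The decision rule and error probability of Eqs. (10.5a) to (10.5c) can be sketched in a few lines. The code below is an illustration only; the function names are hypothetical, and $Q(x)$ is evaluated through the standard identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. A sample exactly equal to zero, which occurs with probability zero, is assigned to 1 here as an arbitrary convention.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def decide(sample):
    """Eq. (10.5a): decide 1 for a positive sample, 0 for a negative one.

    A sample of exactly zero is mapped to 1 (arbitrary tie-breaking assumption).
    """
    return 1 if sample >= 0 else 0

def bit_error_probability(a_p, sigma_n):
    """Eqs. (10.5b) and (10.5c): P_e = Q(rho) with rho = A_p / sigma_n."""
    return q_function(a_p / sigma_n)

# P_e falls quickly as rho = A_p / sigma_n grows.
for rho in (1.0, 2.0, 3.0, 4.0):
    print(rho, bit_error_probability(rho, 1.0))
```

The printed values decrease monotonically with $\rho$, which is why the receiver design problem reduces to maximizing $\rho$.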

10.1.2 Optimum Receiver Filter—Matched Filter 

Let the received pulse $p(t)$ be time-limited to $T_o$ (Fig. 10.1). We shall keep the discussion as general as possible at this point. To minimize the BER or $P_e$, we should determine the best receiver filter $H(f)$ and the corresponding sampling instant $t_m$ such that $Q(\rho)$ is minimized. In other words, we seek a filter with a transfer function $H(f)$ that maximizes

$$\rho^2 = \frac{p_o^2(t_m)}{\sigma_n^2}$$

which is coincidentally also the signal-to-noise ratio at time instant $t = t_m$.

First, denote the Fourier transform of $p(t)$ as $P(f)$ and the PSD of the channel noise $n(t)$ as $S_n(f)$. We will determine the optimum receiver filter in the frequency domain. Starting with

$$p_o(t) = \mathcal{F}^{-1}[P(f)H(f)] = \int_{-\infty}^{\infty} P(f)H(f)\,e^{j2\pi ft}\,df$$

we have the sample value at $t = t_m$

$$p_o(t_m) = \int_{-\infty}^{\infty} P(f)H(f)\,e^{j2\pi ft_m}\,df$$

On the other hand, the filtered noise has zero mean

$$\overline{n_o(t)} = \overline{\int_{0}^{t} n(\tau)\,h(t - \tau)\,d\tau} = \int_{0}^{t}\overline{n(\tau)}\,h(t - \tau)\,d\tau = 0$$

while its variance is given by

$$\sigma_n^2 = \overline{n_o^2(t)} = \int_{-\infty}^{\infty} S_n(f)\,|H(f)|^2\,df$$

Hence, the signal-to-noise ratio is given in the frequency domain as

$$\rho^2 = \frac{\left|\int_{-\infty}^{\infty} H(f)P(f)\,e^{j2\pi ft_m}\,df\right|^2}{\int_{-\infty}^{\infty} S_n(f)\,|H(f)|^2\,df} \tag{10.9}$$


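The Cauchy-Schwarz inequality (Appendix B) is a very powerful tool for finding the optimum filter $H(f)$. We can simply identify

$$X(f) = H(f)\sqrt{S_n(f)} \qquad Y(f) = \frac{P(f)\,e^{j2\pi ft_m}}{\sqrt{S_n(f)}}$$

Then by applying the Cauchy-Schwarz inequality to the numerator of Eq. (10.9), we have

$$\rho^2 = \frac{\left|\int_{-\infty}^{\infty} X(f)Y(f)\,df\right|^2}{\int_{-\infty}^{\infty}|X(f)|^2\,df} \le \frac{\int_{-\infty}^{\infty}|X(f)|^2\,df\cdot\int_{-\infty}^{\infty}|Y(f)|^2\,df}{\int_{-\infty}^{\infty}|X(f)|^2\,df} = \int_{-\infty}^{\infty}|Y(f)|^2\,df = \int_{-\infty}^{\infty}\frac{|P(f)|^2}{S_n(f)}\,df \tag{10.10a}$$

with equality if and only if $X(f) = k[Y(f)]^*$ or

$$H(f)\sqrt{S_n(f)} = k\left[\frac{P(f)\,e^{j2\pi ft_m}}{\sqrt{S_n(f)}}\right]^* = \frac{kP(-f)\,e^{-j2\pi ft_m}}{\sqrt{S_n(f)}}$$

Hence, the SNR is maximized if and only if

$$H(f) = k\,\frac{P(-f)\,e^{-j2\pi ft_m}}{S_n(f)} \tag{10.10b}$$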


where $k$ is an arbitrary constant. This optimum receiver filter is known as the matched filter. This optimum result states that the best filter at the binary linear receiver depends on several important factors: (1) the noise PSD $S_n(f)$, (2) the sampling instant $t_m$, and (3) the pulse shape $P(f)$. It is independent of the gain $k$ at the receiver, since the same gain would apply to both the signal and the noise without affecting the SNR.
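The bound in Eq. (10.10a) can be checked numerically on a discretized frequency axis. The sketch below rests on illustrative assumptions that do not come from the text: a unit-height rectangular pulse on $[0, 1]$ s, a hypothetical colored-noise PSD `noise_psd`, and a frequency grid truncated to $\pm 40$ Hz. It compares the SNR of Eq. (10.9) for the matched filter of Eq. (10.10b) with that of a filter matched to the pulse alone. Since $p(t)$ is real, $P(-f) = P^*(f)$, which is what the code uses.

```python
import cmath
import math

T_M = 1.0   # sampling instant t_m, taken equal to the pulse width

def pulse_spectrum(f, width=1.0):
    """Fourier transform of a unit-height rectangular pulse on [0, width]."""
    if f == 0.0:
        return complex(width, 0.0)
    x = math.pi * f * width
    return width * math.sin(x) / x * cmath.exp(-1j * x)

def noise_psd(f):
    """Hypothetical colored-noise PSD, chosen only for illustration."""
    return 0.5 + 1.0 / (1.0 + f * f)

def snr(filter_response, freqs, df):
    """Eq. (10.9) evaluated as a Riemann sum over the frequency grid."""
    signal = sum(filter_response(f) * pulse_spectrum(f)
                 * cmath.exp(2j * math.pi * f * T_M) for f in freqs) * df
    noise = sum(noise_psd(f) * abs(filter_response(f)) ** 2 for f in freqs) * df
    return abs(signal) ** 2 / noise

def matched(f):
    """Eq. (10.10b) with k = 1."""
    return pulse_spectrum(f).conjugate() * cmath.exp(-2j * math.pi * f * T_M) / noise_psd(f)

def pulse_only_matched(f):
    """Filter matched to the pulse shape while ignoring the noise coloring."""
    return pulse_spectrum(f).conjugate() * cmath.exp(-2j * math.pi * f * T_M)

df = 0.01
freqs = [(i - 4000) * df for i in range(8001)]      # -40 Hz to 40 Hz
bound = sum(abs(pulse_spectrum(f)) ** 2 / noise_psd(f) for f in freqs) * df
rho2_matched = snr(matched, freqs, df)
rho2_pulse_only = snr(pulse_only_matched, freqs, df)
print(bound, rho2_matched, rho2_pulse_only)
```

The matched filter attains the bound to within rounding error, while the filter that ignores the noise PSD falls short of it, as the Cauchy-Schwarz argument predicts.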

For white channel noise $S_n(f) = \mathcal{N}/2$, Eq. (10.10a) becomes

$$\rho^2 \le \rho^2_{\max} = \frac{2}{\mathcal{N}}\int_{-\infty}^{\infty}|P(f)|^2\,df = \frac{2E_p}{\mathcal{N}} \tag{10.11a}$$

where $E_p$ is the energy of $p(t)$, and the matched filter is simply

$$H(f) = k'P(-f)\,e^{-j2\pi ft_m} \tag{10.11b}$$

where $k' = 2k/\mathcal{N}$ is an arbitrary constant.

Figure 10.2 Optimum choice for sampling instant.


The unit impulse response $h(t)$ of the optimum filter is obtained from the inverse Fourier transform

$$h(t) = \mathcal{F}^{-1}\left[k'P(-f)\,e^{-j2\pi ft_m}\right]$$

Note that $p(-t) \Longleftrightarrow P(-f)$ and $e^{-j2\pi ft_m}$ represents the time delay of $t_m$ seconds. Hence,

$$h(t) = k'p(t_m - t) \tag{10.11c}$$

The response $p(t_m - t)$ is the signal pulse $p(-t)$ delayed by $t_m$. Three cases, $t_m < T_o$, $t_m = T_o$, and $t_m > T_o$, are shown in Fig. 10.2. The first case, $t_m < T_o$, yields a noncausal impulse response, which is unrealizable.* Although the other two cases yield physically realizable filters, the last case, $t_m > T_o$, delays the decision-making instant $t_m$ unnecessarily. The case $t_m = T_o$ gives the minimum delay for decision making using a realizable filter. In our future discussion, we shall assume $t_m = T_o$, unless otherwise specified.

* The filter unrealizability can be readily understood intuitively when the decision-making instant is $t_m < T_o$. In this case, we are forced to make a decision before the full pulse has been fed to the filter ($t_m < T_o$). This calls for a prophetic filter, which can respond to inputs before they are applied. As we know, only unrealizable (noncausal) filters can do this job.

Observe that both $p(t)$ and $h(t)$ have a width of $T_0$ seconds. Hence, $p_o(t)$, which is a convolution of $p(t)$ and $h(t)$, has a width of $2T_0$ seconds, with its peak occurring at $t = T_0$, where the decision sample is taken. Also, because $P_o(f) = P(f)H(f) = k'|P(f)|^2 e^{-j2\pi f T_0}$, $p_o(t)$ is symmetrical about $t = T_0$.*

Since the gain $k'$ does not affect the SNR $\rho$, we choose $k' = 1$. This gives the matched filter under white noise

$$h(t) = p(T_0 - t) \tag{10.12a}$$

or equivalently

$$H(f) = P(-f)\,e^{-j2\pi f T_0} \tag{10.12b}$$

for which the signal-to-noise ratio is maximum at the decision-making instant $t = T_0$.
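The properties just stated (output width $2T_0$, peak at $t = T_0$, symmetry about $T_0$) can be checked numerically. The following is a minimal sketch, assuming a hypothetical ramp pulse $p(t) = t$ on $[0, T_0]$ and a simple Riemann-sum convolution; the pulse shape, grid size, and function name are illustrative choices, not taken from the text.

```python
def matched_filter_output(p, dt):
    """Approximate p_o(t) = p(t) convolved with h(t) = p(T0 - t) on a uniform grid."""
    n = len(p)
    h = p[::-1]                      # reversed sample order gives samples of p(T0 - t)
    out = [0.0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            out[i + j] += p[i] * h[j] * dt   # Riemann-sum approximation of the integral
    return out


T0, n = 1.0, 200
dt = T0 / n
p = [(k + 0.5) * dt for k in range(n)]        # hypothetical ramp pulse p(t) = t, 0 <= t <= T0
p_o = matched_filter_output(p, dt)

peak_index = max(range(len(p_o)), key=lambda k: p_o[k])
peak_time = (peak_index + 1) * dt             # output index m corresponds to t = (m + 1) dt
energy = sum(v * v for v in p) * dt           # E_p evaluated on the same grid

print(peak_time, p_o[peak_index], energy)
```

In this sketch the peak sample equals the pulse energy $E_p$ computed on the same grid, which is the signal value $p_o(T_0)$ that enters the SNR at the decision instant.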

The matched filter is optimum in the sense that it maximizes the signal-to-noise ratio 
at the decision-making instant. Although it is reasonable to assume that maximization of this 
particular signal-to-noise ratio will minimize the detection error probability, we have not proven 
that the original structure of linear receiver with threshold detection (sample and decide) is the 
optimum structure. The optimality of the matched filter receiver under white Gaussian noise 
will be shown later (Section 10.6). 

Given the matched filter under white Gaussian noise, the matched filter receiver leads to $\rho_{\max}$ of Eq. (10.11a) as well as the minimum BER of

$$P_e = Q(\rho_{\max}) = Q\left(\sqrt{\frac{2E_p}{\mathcal{N}}}\right) \tag{10.13}$$

Equation (10.13) is quite remarkable. It shows that, as far as the system performance is concerned, when the matched filter receiver is used, various waveforms used for $p(t)$ are equivalent as long as they have the same energy

$$E_p = \int_{-\infty}^{\infty} |P(f)|^2\,df = \int_0^{T_0} |p(t)|^2\,dt$$


The matched filter may also be implemented by the alternative arrangement shown in Fig. 10.3. If the input to the matched filter is $y(t)$, then the output $r(t)$ is given by

$$r(t) = \int_{-\infty}^{\infty} y(x)\,h(t - x)\,dx \tag{10.14}$$

where $h(t) = p(T_0 - t)$ and

$$h(t - x) = p[T_0 - (t - x)] = p(x + T_0 - t) \tag{10.15}$$

* This follows from the fact that because $|P(f)|^2$ is an even function of $f$, its inverse transform is symmetrical about $t = 0$ (see Prob. 3.1-1). The output from the previous input pulse terminates and has a zero value at $t = T_0$. Similarly, the output from the following pulse starts and has a zero value at $t = T_0$. Hence, at the decision-making instant $T_0$, no intersymbol interference occurs.

Figure 10.3

Hence,

$$r(t) = \int_{-\infty}^{\infty} y(x)\,p(x + T_0 - t)\,dx \tag{10.16a}$$

At the decision-making instant $t = T_0$, we have

$$r(T_0) = \int_{-\infty}^{\infty} y(x)\,p(x)\,dx \tag{10.16b}$$

Because the input $y(x)$ is assumed to start at $x = 0$ and $p(x) = 0$ for $x > T_0$, we have the decision variable

$$r(T_0) = \int_0^{T_0} y(x)\,p(x)\,dx \tag{10.16c}$$

We can implement Eqs. (10.16) as shown in Fig. 10.3. This type of arrangement, known as the correlation receiver, is equivalent to the matched filter receiver.

The right-hand side of Eq. (10.16c) is the cross-correlation of the received pulse with $p(t)$. Recall that correlation basically measures the similarity of signals (Sec. 2.7). Thus, the optimum detector measures the similarity of the received signal with the pulse $p(t)$. Based on this similarity measure, the sign of the correlation decides whether $p(t)$ or $-p(t)$ was transmitted.
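The equivalence between Eq. (10.14) sampled at $t = T_0$ and the correlation in Eq. (10.16c) can be illustrated on sampled data. This is a rough sketch under stated assumptions: the pulse shape, the per-sample Gaussian noise level, and the function names are hypothetical, and the integrals are replaced by Riemann sums.

```python
import random


def correlate(y, p, dt):
    """Decision variable r(T0) = integral of y(x) p(x) dx over [0, T0]."""
    return sum(a * b for a, b in zip(y, p)) * dt


def matched_filter_sample(y, p, dt):
    """Output of h(t) = p(T0 - t) driven by y(t), sampled at t = T0."""
    n = len(p)
    h = p[::-1]
    # Convolution at t = T0 pairs y(x) with h(T0 - x), i.e. y[i] with h[n - 1 - i].
    return sum(y[i] * h[n - 1 - i] for i in range(n)) * dt


random.seed(0)                                          # reproducible illustration
n, dt = 100, 0.01
p = [1.0 if k < n // 2 else -0.5 for k in range(n)]     # hypothetical pulse shape
y = [-v + random.gauss(0.0, 0.3) for v in p]            # -p(t) sent, plus sample noise

r_corr = correlate(y, p, dt)
r_mf = matched_filter_sample(y, p, dt)
decision = "p(t)" if r_corr > 0 else "-p(t)"
print(r_corr, r_mf, decision)
```

Both routines form the same sum of products, so the two decision variables coincide; only the sign of the result is needed for the polar decision.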

Thus far we have discussed polar signaling, in which only one basic pulse $p(t)$ of opposite signs is used. Generally, in binary communication, we use two distinct pulses $p(t)$ and $q(t)$ to represent the two symbols. The optimum receiver for such a case will now be discussed.


10.2 GENERAL BINARY SIGNALING 

10.2.1 Optimum Linear Receiver Analysis 

In a binary scheme where symbols are transmitted every $T_b$ seconds, the more general transmission scheme may use two pulses $p(t)$ and $q(t)$ to transmit 1 and 0. The optimum linear receiver structure under consideration is shown in Fig. 10.4a. The received signal is

$$y(t) = \begin{cases} p(t) + n(t) & 0 \le t \le T_b \quad \text{for data symbol 1} \\ q(t) + n(t) & 0 \le t \le T_b \quad \text{for data symbol 0} \end{cases}$$

The incoming signal $y(t)$ is transmitted through a filter $H(f)$, and the output $r(t)$ is sampled at $T_b$. The decision of whether 0 or 1 was present at the input depends on whether or not $r(T_b)$ is less than $a_o$, where $a_o$ is the optimum threshold.

Figure 10.4 Optimum binary threshold detection. (a) The received signal $y(t)$ passes through $H(f)$, is sampled at $t = T_b$ to give $r(T_b)$, and a threshold device decides $\hat{m} = 0$ if $r(T_b) < a_o$ and $\hat{m} = 1$ if $r(T_b) > a_o$.

Let $p_o(t)$ and $q_o(t)$ be the response of $H(f)$ to inputs $p(t)$ and $q(t)$, respectively. From Eq. (10.7) it follows that

$$p_o(T_b) = \int_{-\infty}^{\infty} P(f)\,H(f)\,e^{j2\pi f T_b}\,df \tag{10.17a}$$

$$q_o(T_b) = \int_{-\infty}^{\infty} Q(f)\,H(f)\,e^{j2\pi f T_b}\,df \tag{10.17b}$$

and $\sigma_n^2$, the variance, or power, of the noise at the filter output, is

$$\sigma_n^2 = \int_{-\infty}^{\infty} S_n(f)\,|H(f)|^2\,df \tag{10.17c}$$

Without loss of generality, we let $p_o(T_b) > q_o(T_b)$. Denote $n$ as the noise output at $T_b$. Then the sampler output is $r(T_b) = q_o(T_b) + n$ or $p_o(T_b) + n$, depending on whether $m = 0$ or $m = 1$ is received. Hence, $r$ is a Gaussian RV of variance $\sigma_n^2$ with mean $q_o(T_b)$ or $p_o(T_b)$, depending on whether $m = 0$ or 1. Thus, the conditional PDFs of the sampled output $r(T_b)$ are

$$p_{r|m}(r|0) = \frac{1}{\sigma_n\sqrt{2\pi}} \exp\left(-\frac{[r - q_o(T_b)]^2}{2\sigma_n^2}\right)$$

$$p_{r|m}(r|1) = \frac{1}{\sigma_n\sqrt{2\pi}} \exp\left(-\frac{[r - p_o(T_b)]^2}{2\sigma_n^2}\right)$$


Optimum Threshold 

The two PDFs are shown in Fig. 10.4b. If $a_o$ is the optimum threshold of detection, then the decision rule is

$$\hat{m} = \begin{cases} 0 & \text{if } r < a_o \\ 1 & \text{if } r > a_o \end{cases}$$

The conditional error probability $P(\epsilon\,|\,m = 0)$ is the probability of making a wrong decision when $m = 0$. This is simply the area $A_0$ under $p_{r|m}(r|0)$ from $a_o$ to $\infty$. Similarly, $P(\epsilon\,|\,m = 1)$ is the area $A_1$ under $p_{r|m}(r|1)$ from $-\infty$ to $a_o$ (Fig. 10.4b), and

$$P_e = \sum_i P(\epsilon\,|\,m_i)\,P(m_i) = \frac{1}{2}(A_0 + A_1) = \frac{1}{2}\left[Q\left(\frac{a_o - q_o(T_b)}{\sigma_n}\right) + Q\left(\frac{p_o(T_b) - a_o}{\sigma_n}\right)\right] \tag{10.18}$$

assuming $P_m(0) = P_m(1) = 0.5$. From Fig. 10.4b it can be seen that the sum $A_0 + A_1$ of the shaded areas is minimized by choosing $a_o$ at the intersection of the two PDFs. This optimum threshold can also be determined directly by setting to zero the derivative of $P_e$ in Eq. (10.18) with respect to $a_o$ such that

$$\frac{\partial P_e}{\partial a_o} = \frac{1}{2}\left[-\frac{1}{\sigma_n\sqrt{2\pi}} \exp\left(-\frac{[a_o - q_o(T_b)]^2}{2\sigma_n^2}\right) + \frac{1}{\sigma_n\sqrt{2\pi}} \exp\left(-\frac{[p_o(T_b) - a_o]^2}{2\sigma_n^2}\right)\right] = 0$$

Thus, the optimum $a_o$ is

$$a_o = \frac{p_o(T_b) + q_o(T_b)}{2} \tag{10.19a}$$
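The midpoint result in Eq. (10.19a) can be sanity-checked by evaluating Eq. (10.18) over a grid of candidate thresholds. A minimal sketch, assuming hypothetical values for $p_o(T_b)$, $q_o(T_b)$, and $\sigma_n$, and computing $Q(x)$ from the complementary error function:

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probability(a, p_o, q_o, sigma):
    """Eq. (10.18) for equally likely symbols and threshold a."""
    return 0.5 * (q_func((a - q_o) / sigma) + q_func((p_o - a) / sigma))


p_o, q_o, sigma = 1.0, -0.4, 0.5          # hypothetical sampled outputs and noise std
grid = [q_o + (p_o - q_o) * k / 1000 for k in range(1001)]
best = min(grid, key=lambda a: error_probability(a, p_o, q_o, sigma))
midpoint = 0.5 * (p_o + q_o)              # Eq. (10.19a)

print(best, midpoint, error_probability(best, p_o, q_o, sigma))
```

The grid search is only a check; the closed-form midpoint is what a receiver would actually use.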


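and the corresponding $P_e$ is

$$P_e = P(\epsilon\,|\,0) = P(\epsilon\,|\,1) = \int_{a_o}^{\infty} \frac{1}{\sigma_n\sqrt{2\pi}} \exp\left(-\frac{[r - q_o(T_b)]^2}{2\sigma_n^2}\right) dr \tag{10.19b}$$

$$= Q\left(\frac{a_o - q_o(T_b)}{\sigma_n}\right) = Q\left(\frac{p_o(T_b) - q_o(T_b)}{2\sigma_n}\right) = Q\left(\frac{\beta}{2}\right) \tag{10.19c}$$

where we define

$$\beta = \frac{p_o(T_b) - q_o(T_b)}{\sigma_n} \tag{10.20}$$

Substituting Eq. (10.17) into Eq. (10.20), we get

$$\beta^2 = \frac{\left|\int_{-\infty}^{\infty} [P(f) - Q(f)]\,H(f)\,e^{j2\pi f T_b}\,df\right|^2}{\int_{-\infty}^{\infty} S_n(f)\,|H(f)|^2\,df}$$

This equation is of the same form as Eq. (10.9) with $P(f)$ replaced by $P(f) - Q(f)$. Hence, the Cauchy-Schwarz inequality can again be applied to show

$$\beta_{\max}^2 = \int_{-\infty}^{\infty} \frac{|P(f) - Q(f)|^2}{S_n(f)}\,df \tag{10.21a}$$

and the optimum filter $H(f)$ is given by

$$H(f) = k\,\frac{[P(-f) - Q(-f)]\,e^{-j2\pi f T_b}}{S_n(f)} \tag{10.21b}$$

where $k$ is an arbitrary constant.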


The Special Case of White Gaussian Noise 

For white noise $S_n(f) = \mathcal{N}/2$, and the optimum filter $H(f)$ is given by*

$$H(f) = [P(-f) - Q(-f)]\,e^{-j2\pi f T_b} \tag{10.22a}$$

and

$$h(t) = p(T_b - t) - q(T_b - t) \tag{10.22b}$$

This is a filter matched to the pulse $p(t) - q(t)$. The corresponding $\beta$ is [Eq. (10.21a)]

$$\beta_{\max}^2 = \frac{2}{\mathcal{N}} \int_{-\infty}^{\infty} |P(f) - Q(f)|^2\,df \tag{10.23a}$$

$$= \frac{2}{\mathcal{N}} \int_0^{T_b} [p(t) - q(t)]^2\,dt \tag{10.23b}$$

$$= \frac{E_p + E_q - 2E_{pq}}{\mathcal{N}/2} \tag{10.23c}$$

where $E_p$ and $E_q$ are the energies of $p(t)$ and $q(t)$, respectively, and

$$E_{pq} = \int_0^{T_b} p(t)\,q(t)\,dt \tag{10.24}$$

So far, we have been using the notation $P_e$ to denote error probability. In the binary case, this error probability is the bit error probability or bit error rate (BER) and will be denoted by $P_b$ (rather than $P_e$). Thus, from Eqs. (10.19c) and (10.23c),

$$P_b = Q\left(\frac{\beta_{\max}}{2}\right) \tag{10.25a}$$

$$= Q\left(\sqrt{\frac{E_p + E_q - 2E_{pq}}{2\mathcal{N}}}\right) \tag{10.25b}$$

* Because $k$ in Eq. (10.21b) is arbitrary, we choose $k = \mathcal{N}/2$ for convenience.

The optimum threshold $a_o$ is obtained by substituting Eqs. (10.17a, b) and (10.22a) into Eq. (10.19a) and recognizing (via variable substitution) that

$$\int_{-\infty}^{\infty} P(f)\,Q(-f)\,df = \int_{-\infty}^{\infty} P(-f)\,Q(f)\,df = E_{pq} \tag{10.26}$$

This gives

$$a_o = \frac{1}{2}(E_p - E_q) \tag{10.27}$$

In deriving the optimum binary receiver, we assumed a certain receiver structure (the threshold detection receiver in Fig. 10.4). It is not clear yet whether there exists another structure that may have better performance than that in Fig. 10.4. It will be shown later (in Sec. 10.6) that for Gaussian noise, the receiver derived here is the definite optimum. Equation (10.25b) gives $P_b$ for the optimum receiver when the channel noise is white Gaussian. For the case of nonwhite noise, $P_b$ is obtained by substituting $\beta_{\max}$ from Eq. (10.21a) into Eq. (10.25a).
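Equations (10.24), (10.25b), and (10.27) reduce the white-noise design to three energy-like integrals. The sketch below evaluates them on sampled pulses; it assumes rectangular test pulses, a hypothetical value of $\mathcal{N}$, and Riemann-sum integration, and the function name is illustrative.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def binary_receiver_metrics(p, q, dt, noise_n):
    """Return (E_p, E_q, E_pq, a_o, P_b) per Eqs. (10.24), (10.27), (10.25b)."""
    e_p = sum(v * v for v in p) * dt
    e_q = sum(v * v for v in q) * dt
    e_pq = sum(a * b for a, b in zip(p, q)) * dt
    threshold = 0.5 * (e_p - e_q)
    p_b = q_func(math.sqrt((e_p + e_q - 2.0 * e_pq) / (2.0 * noise_n)))
    return e_p, e_q, e_pq, threshold, p_b


n, dt, noise_n = 100, 0.01, 0.25           # hypothetical grid and noise level
pulse = [1.0] * n                           # unit-amplitude rectangular pulse, T_b = 1
polar = binary_receiver_metrics(pulse, [-v for v in pulse], dt, noise_n)
on_off = binary_receiver_metrics(pulse, [0.0] * n, dt, noise_n)
print(polar)
print(on_off)
```

With these inputs the polar case gives a zero threshold and the on-off case gives a threshold of $E_p/2$, in line with Eq. (10.27).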

Equivalent Optimum Binary Receivers 

For the optimum receiver in Fig. 10.4a,

$$H(f) = P(-f)\,e^{-j2\pi f T_b} - Q(-f)\,e^{-j2\pi f T_b}$$

This filter can be realized as a parallel combination of two filters matched to $p(t)$ and $q(t)$, respectively, as shown in Fig. 10.5a. Yet another equivalent form is shown in Fig. 10.5b. Because the threshold is $(E_p - E_q)/2$, we subtract $E_p/2$ and $E_q/2$, respectively, from the two matched filter outputs. This is equivalent to shifting the threshold to 0. In the case of $E_p = E_q$, we need not subtract $E_p/2$ and $E_q/2$ from the two outputs, and the receiver simplifies to that shown in Fig. 10.5c.

10.2.2 Performance Analysis of General Binary Systems 

In this section, we analyze the performance of several typical binary digital communication systems by applying the techniques derived in the last section for general binary receivers.

Polar Signaling

For the case of polar signaling, $q(t) = -p(t)$. Hence,

$$E_p = E_q \qquad \text{and} \qquad E_{pq} = -\int_{-\infty}^{\infty} p^2(t)\,dt = -E_p \tag{10.28}$$

Substituting these results into Eq. (10.25b) yields

$$P_b = Q\left(\sqrt{\frac{2E_p}{\mathcal{N}}}\right) \tag{10.29}$$

Also from Eq. (10.22b),

$$h(t) = 2p(T_b - t) \tag{10.30}$$

Figure 10.5 Realization of the optimum binary threshold detector.

Recall that the multiplication of $h(t)$ by any constant amplifies both the signal and the noise by the same factor, and hence does not affect the system performance. For convenience, we shall multiply $h(t)$ by 0.5 to obtain

$$h(t) = p(T_b - t) \tag{10.31}$$

From Eq. (10.27), the threshold $a_o$ is

$$a_o = 0 \tag{10.32}$$

Therefore, for the polar case, the receiver in Fig. 10.5a reduces to that shown in Fig. 10.6a with threshold 0. This filter is equivalent to that in Fig. 10.3.

The error probability can be expressed in terms of a more basic parameter $E_b$, the energy per bit. In the polar case, $E_p = E_q$ and the bit energy $E_b$ is

$$E_b = E_p\,P(m = 1) + E_q\,P(m = 0) = E_p\,P(m = 1) + E_p[1 - P(m = 1)] = E_p$$

and from Eq. (10.29),

$$P_b = Q\left(\sqrt{\frac{2E_b}{\mathcal{N}}}\right) \tag{10.33}$$

Figure 10.6 (a) Threshold detector, with $h(t) = p(T_b - t)$, sampling at $t = T_b$ to give $r(T_b)$, and decision $p(t)$ if $r > 0$, $-p(t)$ if $r < 0$; (b) its error probability for polar signaling.

The parameter $E_b/\mathcal{N}$ is the normalized energy per bit, which will be seen in future discussions as a fundamental parameter serving as a figure of merit in digital communication.* Because the signal power is equal to $E_b$ times the bit rate, a given $E_b$ is equivalent to a given signal power (for a given bit rate). Hence, when we compare systems for a given value of $E_b$, we are comparing them for a given signal power.

Figure 10.6b plots $P_b$ as a function of $E_b/\mathcal{N}$ (in decibels). Equation (10.33) indicates that, for optimum threshold detection, the polar system performance depends not on the pulse shape, but on the pulse energy.


On-Off Signaling 

In the case of on-off signaling, $q(t) = 0$, and the receiver of Fig. 10.5a can remove the lower branch filter of $q(T_b - t)$. Based on Eq. (10.27), the optimum threshold for the on-off signaling receiver is

$$a_o = E_p/2$$

* If the transmission rate is $R_b$ pulses per second, the signal power is $S_i = E_b R_b$, and $E_b/\mathcal{N} = S_i/\mathcal{N}R_b$. Observe that $S_i/\mathcal{N}R_b$ is similar to the parameter $\gamma$ (the signal-to-noise ratio) used in analog systems.

Additionally, substituting $q(t) = 0$ into Eqs. (10.24) and (10.25) yields

$$E_q = 0, \qquad E_{pq} = 0, \qquad \text{and} \qquad P_b = Q\left(\sqrt{\frac{E_p}{2\mathcal{N}}}\right) \tag{10.34}$$

If both symbols $m = 0$ and $m = 1$ have equal probability 0.5, then the average bit energy is given by

$$E_b = \frac{E_p + E_q}{2} = \frac{E_p}{2}$$

Therefore, the BER can be written as

$$P_b = Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right) \tag{10.35}$$

A comparison of Eqs. (10.35) and (10.33) shows that on-off signaling requires exactly twice as much energy per bit (3 dB more power) to achieve the same performance (i.e., the same $P_b$) as polar signaling.

Orthogonal Signaling 

In orthogonal signaling, $p(t)$ and $q(t)$ are selected to be orthogonal over the interval $(0, T_b)$. This gives

$$E_{pq} = \int_0^{T_b} p(t)\,q(t)\,dt = 0 \tag{10.36}$$

On-off signaling is in fact a special case of orthogonal signaling. Two additional examples of binary orthogonal pulses are shown in Fig. 10.7. From Eq. (10.25),

$$P_b = Q\left(\sqrt{\frac{E_p + E_q}{2\mathcal{N}}}\right) \tag{10.37}$$

Assuming 1 and 0 to be equiprobable,

$$E_b = \frac{E_p + E_q}{2}$$

and

$$P_b = Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right) \tag{10.38}$$

Figure 10.7

This shows that the performance of any orthogonal binary signaling is inferior to that of polar signaling by 3 dB. This naturally includes on-off signaling.
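The 3 dB statement can be made concrete by evaluating Eqs. (10.33) and (10.38) side by side. A small sketch, assuming $E_b/\mathcal{N}$ is supplied in decibels; the function names are illustrative.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_polar(eb_over_n_db):
    """Eq. (10.33): P_b = Q(sqrt(2 E_b / N))."""
    ratio = 10.0 ** (eb_over_n_db / 10.0)
    return q_func(math.sqrt(2.0 * ratio))


def ber_orthogonal(eb_over_n_db):
    """Eq. (10.38), which also covers on-off signaling: P_b = Q(sqrt(E_b / N))."""
    ratio = 10.0 ** (eb_over_n_db / 10.0)
    return q_func(math.sqrt(ratio))


shift_db = 10.0 * math.log10(2.0)          # a factor of 2 in energy, about 3 dB
for db in (4.0, 8.0, 12.0):
    print(db, ber_polar(db), ber_orthogonal(db), ber_orthogonal(db + shift_db))
```

In each printed row the last column matches the polar column: raising $E_b/\mathcal{N}$ by a factor of 2 brings the orthogonal scheme to the polar error rate.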


10.3 COHERENT RECEIVERS FOR DIGITAL 
CARRIER MODULATIONS 

We introduced amplitude shift keying (ASK), frequency shift keying (FSK), and phase shift keying (PSK) in Section 7.9. Figure 10.8 uses a rectangular baseband pulse to show the three binary schemes. The baseband pulse may be specifically shaped (e.g., a raised cosine) to eliminate intersymbol interference and to stay within a finite bandwidth.

BPSK 

In particular, the binary PSK (BPSK) modulation transmits binary symbols via

$$1:\ \sqrt{2}\,p'(t)\cos\omega_c t \qquad\qquad 0:\ -\sqrt{2}\,p'(t)\cos\omega_c t$$

Here $p'(t)$ denotes the baseband pulse shape. When $p(t) = \sqrt{2}\,p'(t)\cos\omega_c t$, this has exactly the same signaling form as the baseband polar signaling. Thus, the optimum binary receiver also takes the form of Fig. 10.5a. As a result, for equally likely binary data, the optimum threshold is $a_o = 0$ and the minimum probability of detection error is identically

$$P_b = Q\left(\sqrt{\frac{2E_b}{\mathcal{N}}}\right) = Q\left(\sqrt{\frac{2E_p}{\mathcal{N}}}\right) \tag{10.39}$$

where the pulse energy is simply

$$E_p = \int_0^{T_b} p^2(t)\,dt = 2\int_0^{T_b} [p'(t)]^2\cos^2\omega_c t\,dt = \int_0^{T_b} [p'(t)]^2\,dt$$

This result requires a carrier frequency sufficiently high such that $f_c T_b \gg 1$.

Figure 10.8

Binary ASK

Similarly, for binary ASK, the transmission is

$$1:\ \sqrt{2}\,p'(t)\cos\omega_c t \qquad\qquad 0:\ 0$$

This coincides with the on-off signaling analyzed earlier such that the optimum threshold should be $a_o = E_p/2$ and the minimum BER for binary ASK is

$$P_b = Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right) \tag{10.40}$$

where $E_b = E_p/2$ is the average energy per bit for equally likely symbols, as in the on-off case.

Comparison of Eq. (10.39) and Eq. (10.40) shows that for the same performance, the pulse energy in ASK must be twice that in PSK. Hence, ASK requires 3 dB more power than PSK. Thus, in optimum (coherent) detection, PSK is always preferable to ASK. For this reason, ASK is of no practical importance in optimum detection. But ASK can be useful in noncoherent systems (e.g., optical communications). Envelope detection, for example, can be applied to ASK. In PSK, the information lies in the phase, and, hence, it cannot be detected noncoherently.

The baseband pulses $p'(t)$ used in carrier systems should be shaped to minimize the ISI. The bandwidth of the PSK or ASK signal is twice that of the corresponding baseband signal because of modulation.*

Bandpass Matched Filter as a Coherent Receiver 

For both PSK and ASK, the optimum matched filter receiver of Fig. 10.5a can be implemented. As shown in Fig. 10.9a, the received RF pulse can be detected by a filter matched to the RF pulse $p(t)$ followed by a sampler before a threshold detector.

On the other hand, the same matched filter receiver may also be modified into Fig. 10.9b without changing the signal samples for decision. The alternative implementation first demodulates the incoming RF signal coherently by multiplying it with $\sqrt{2}\cos\omega_c t$. The product is the baseband pulse* $p'(t)$ plus a baseband noise with PSD $\mathcal{N}/2$ (see Example 9.13), and this is applied to a filter matched to the baseband pulse $p'(t)$. The two receiver schemes are equivalent. They can also be implemented as correlation receivers.

* We can also use QAM (quadrature multiplexing) to double bandwidth efficiency.

Figure 10.9 Coherent detection of digital modulated signals.

Figure 10.10 Optimum coherent detection of binary FSK signals.

Frequency Shift Keying 

In FSK, RF binary signals are transmitted as

$$0:\ \sqrt{2}\,p'(t)\cos[\omega_c - (\Delta\omega/2)]t \qquad\qquad 1:\ \sqrt{2}\,p'(t)\cos[\omega_c + (\Delta\omega/2)]t$$

Such a waveform may be considered to be two interleaved ASK waves. Hence, the PSD will consist of two PSDs, centered at $[f_c - (\Delta f/2)]$ and $[f_c + (\Delta f/2)]$. For a large $\Delta f/f_c$, the PSD will consist of two nonoverlapping PSDs. For a small $\Delta f/f_c$, the two spectra merge, and the bandwidth decreases. But in no case is the bandwidth less than that of ASK or PSK.

The optimum correlation receiver for binary FSK is given in Fig. 10.10. Because the pulses have equal energy, when the symbols are equally likely, the optimum threshold is $a_o = 0$.

Consider the rather common case of rectangular $p'(t) = A$, that is, no pulse shaping in FSK:

$$q(t) = \sqrt{2}\,A\cos\left(\omega_c - \frac{\Delta\omega}{2}\right)t \qquad\qquad p(t) = \sqrt{2}\,A\cos\left(\omega_c + \frac{\Delta\omega}{2}\right)t$$

* There is also a spectrum of $p'(t)$ centered at $2\omega_c$, which is eventually eliminated by the filter matched to $p'(t)$.

To compute $P_b$ from Eq. (10.25b), we need $E_{pq}$,

$$E_{pq} = \int_0^{T_b} p(t)\,q(t)\,dt = 2A^2\int_0^{T_b} \cos\left(\omega_c - \frac{\Delta\omega}{2}\right)t\,\cos\left(\omega_c + \frac{\Delta\omega}{2}\right)t\,dt$$

$$= A^2\left[\int_0^{T_b} \cos(\Delta\omega)t\,dt + \int_0^{T_b} \cos 2\omega_c t\,dt\right]$$

$$= A^2 T_b\left[\frac{\sin(\Delta\omega)T_b}{(\Delta\omega)T_b} + \frac{\sin 2\omega_c T_b}{2\omega_c T_b}\right]$$

In practice $\omega_c T_b \gg 1$, and the second term on the right-hand side can be ignored. Therefore,

$$E_{pq} = A^2 T_b\,\mathrm{sinc}(\Delta\omega T_b)$$

Similarly,

$$E_b = E_p = E_q = \int_0^{T_b} [p(t)]^2\,dt = A^2 T_b$$

The BER analysis of Eq. (10.25b) for equiprobable binary symbols 1 and 0 becomes

$$P_b = Q\left(\sqrt{\frac{E_b - E_b\,\mathrm{sinc}(\Delta\omega T_b)}{\mathcal{N}}}\right)$$

It is therefore clear that to minimize $P_b$, we should select $\Delta\omega$ for the binary FSK such that $\mathrm{sinc}(\Delta\omega T_b)$ is minimum. Figure 10.11a shows $\mathrm{sinc}(\Delta\omega T_b)$ as a function of $\Delta\omega T_b$. The minimum value of $E_{pq}$ is $-0.217A^2T_b$ at $\Delta\omega T_b = 1.43\pi$, or when

$$\Delta f = \frac{\Delta\omega}{2\pi} = \frac{0.715}{T_b} = 0.715R_b$$

This leads to the minimum binary FSK BER

$$P_b = Q\left(\sqrt{\frac{1.217E_b}{\mathcal{N}}}\right) \tag{10.41a}$$
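The constants $1.43\pi$, $-0.217$, and $1.217$ can be reproduced with a brute-force search. This is a minimal sketch, assuming the convention $\mathrm{sinc}(x) = \sin x / x$ that the derivation of $E_{pq}$ uses, and a hypothetical search grid over $0 < x \le 3\pi$.

```python
import math


def sinc(x):
    """sinc(x) = sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x


# Candidate values of x = (delta omega) * T_b on a fine grid over (0, 3*pi].
grid = [k * 1e-4 * math.pi for k in range(1, 30001)]
x_min = min(grid, key=sinc)

print(x_min / math.pi)          # location of the minimum in units of pi
print(sinc(x_min))              # minimum of E_pq / (A^2 T_b)
print(1.0 - sinc(x_min))        # factor multiplying E_b / N in Eq. (10.41a)
print(x_min / (2.0 * math.pi))  # delta f in units of R_b
```

The search stops at $3\pi$ because the later negative lobes of the sinc function are shallower than the first one.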


When $E_{pq} = 0$, we have the case of orthogonal signaling. From Fig. 10.11a, it is clear that $E_{pq} = 0$ for $\Delta f = n/2T_b$, where $n$ is any nonzero integer. Although it appears that binary FSK can use any such $n$ when selecting $\Delta f$, larger $\Delta f$ means wider separation between the signaling frequencies $\omega_c - (\Delta\omega/2)$ and $\omega_c + (\Delta\omega/2)$, and consequently larger transmission bandwidth. To minimize the bandwidth, $\Delta f$ should be as small as possible. Based on Fig. 10.11a, the minimum value of $\Delta f$ that can be used for orthogonal signaling is $1/2T_b$. FSK using this value of $\Delta f$ is known as minimum shift keying (MSK).

Minimum Shift Keying 

In MSK, not only are the two frequencies selected to be separated by $1/2T_b$, but we should also take care to preserve phase continuity when switching between the two frequencies $f_c \pm \Delta f/2$ at the transmitter. This is because abrupt phase changes at the bit transition instants when we are switching frequencies would significantly increase the signal bandwidth. FSK schemes maintaining phase continuity are known as continuous phase FSK (CPFSK), of which MSK is one special case. These schemes have rapid spectral roll-off and better spectral efficiency.

Figure 10.11 (a) The minimum of the sinc function and (b) the MSK spectrum.

To maintain phase continuity in CPFSK (or MSK), the phase at every bit transition is made to depend on the past data sequence. Consider, for example, the data sequence 1001... starting at $t = 0$. The first pulse, corresponding to the first bit 1, is $\cos[\omega_c + (\Delta\omega/2)]t$ over the interval 0 to $T_b$ seconds. At $t = T_b$ this pulse ends with a phase $[\omega_c + (\Delta\omega/2)]T_b$. The next pulse, corresponding to the second data bit 0, is $\cos[\omega_c - (\Delta\omega/2)]t$. To maintain phase continuity at the transition instant, this pulse is given additional phase $(\omega_c + \Delta\omega)T_b$. We achieve this continuity at each transition instant $kT_b$.
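One common way to obtain this continuity in a simulation is to accumulate phase from the instantaneous frequency, so that each bit starts at the phase where the previous bit ended. The sketch below takes that route rather than computing per-bit phase offsets explicitly; the carrier frequency, bit duration, sampling density, and function name are hypothetical.

```python
import math


def cpfsk_phase(bits, fc, delta_f, tb, samples_per_bit):
    """Phase samples of a continuous-phase FSK signal.

    Bit 1 uses frequency fc + delta_f / 2 and bit 0 uses fc - delta_f / 2.
    """
    dt = tb / samples_per_bit
    phase, samples = 0.0, []
    for bit in bits:
        freq = fc + (delta_f / 2.0 if bit else -delta_f / 2.0)
        for _ in range(samples_per_bit):
            samples.append(phase)
            phase += 2.0 * math.pi * freq * dt   # phase carries over across bit boundaries
    return samples


tb, fc = 1.0, 4.0                      # hypothetical bit duration and carrier frequency
delta_f = 1.0 / (2.0 * tb)             # MSK frequency separation
phases = cpfsk_phase([1, 0, 0, 1], fc, delta_f, tb, samples_per_bit=100)
signal = [math.cos(ph) for ph in phases]

largest_phase_step = max(b - a for a, b in zip(phases, phases[1:]))
largest_signal_step = max(abs(b - a) for a, b in zip(signal, signal[1:]))
print(largest_phase_step, largest_signal_step)
```

Because the phase never jumps by more than one sample's worth of the higher frequency, the waveform has no discontinuity at the bit transitions $kT_b$.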

MSK being an orthogonal scheme, its error probability is given by

$$P_b = Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right) \tag{10.41b}$$

Although this performance appears inferior to that of the optimum case in Eq. (10.41a), closer examination tells a different story. Indeed, this result is true only if MSK is coherently detected as ordinary FSK using an observation interval of $T_b$. However, recall that MSK is CPFSK, where the phase of each pulse is dependent on the past data sequence. Hence, better performance may be obtained by observing the received waveform over a period longer than $T_b$. Indeed, it can be shown that if an MSK signal is detected over an observation interval of $2T_b$, then the performance of MSK is identical to that of optimum PSK, that is,

$$P_b = Q\left(\sqrt{\frac{2E_b}{\mathcal{N}}}\right) \tag{10.41c}$$

MSK also has other useful properties. It has self-synchronization capabilities and its bandwidth is only $1.5R_b$, as shown in Fig. 10.11b. This is only 50% higher than for duobinary signaling. Moreover, the MSK spectrum decays much more rapidly, as $1/f^4$, in contrast to the PSK (or bipolar) spectrum, which decays only as $1/f^2$ [see Eqs. (7.15) and (7.22)]. Because of these properties, MSK has received a great deal of practical attention. For more discussions, see Refs. 1 and 2.


10.4 SIGNAL SPACE ANALYSIS OF 
OPTIMUM DETECTION 

Thus far, our discussions on digital receiver optimization have been limited to the simple case of linear threshold detection for binary transmissions under Gaussian channel noise. Such receivers are constrained by their linear structure. To determine the truly optimum receivers, we need to answer the question: Given an $M$-ary transmission with channel noise $\mathrm{n}(t)$ and channel output

$$\mathrm{y}(t) = p_i(t) + \mathrm{n}(t) \qquad 0 \le t \le T_0 \qquad i = 1, \ldots, M$$

what receiver is optimum in the sense that it leads to the minimum error probability?

To answer this question, we shall analyze the problem of digital signal detection from a more fundamental point of view. Recognize that the channel output is a random process $\mathrm{y}(t)$, $0 \le t \le T_0$. Thus, the receiver must make a decision by transforming $\mathrm{y}(t)$ into a finite-dimensional decision space. Such an analysis is greatly facilitated by a geometrical representation of signals and noises.

A Word about Notation: Let us clarify the notations used here to avoid confusion. As before, we use roman type to denote an RV or a random process [e.g., $\mathrm{x}$ or $\mathrm{x}(t)$]. A particular value assumed by the RV in a certain trial is denoted by italic type. Thus, $x$ represents the value assumed by $\mathrm{x}$. Similarly, $x(t)$ represents a particular sample function of the random process $\mathrm{x}(t)$. For random vectors, we follow the same convention: a random vector is denoted by roman boldface type, and a particular value assumed by the vector in a certain trial is represented by boldface italic type. Thus, $\mathbf{r}$ denotes a random vector, but $\boldsymbol{r}$ is a particular value of $\mathbf{r}$.

10.4.1 Geometrical Signal Space 

We now formally show that a signal in an $M$-ary transmission system is in reality an $n$-dimensional vector and can be represented by a point in an $n$-dimensional hyperspace ($n \le M$). The foundations for such a viewpoint were first laid during the introduction of the signal space in Sec. 2.6.

To begin, an ordered $n$-tuple $(x_1, x_2, \ldots, x_n)$ is an $n$-dimensional vector $\boldsymbol{x}$. The $n$-dimensional (signal) vector space is spanned by $n$ unit vectors $\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \ldots, \boldsymbol{\varphi}_n$,

$$\begin{aligned} \boldsymbol{\varphi}_1 &= (1, 0, 0, \ldots, 0) \\ \boldsymbol{\varphi}_2 &= (0, 1, 0, \ldots, 0) \\ &\ \ \vdots \\ \boldsymbol{\varphi}_n &= (0, 0, 0, \ldots, 1) \end{aligned} \tag{10.42}$$

Any vector $\boldsymbol{x} = (x_1, x_2, \ldots, x_n)$ can be expressed as a linear combination of the $n$ unit vectors,

$$\boldsymbol{x} = x_1\boldsymbol{\varphi}_1 + x_2\boldsymbol{\varphi}_2 + \cdots + x_n\boldsymbol{\varphi}_n \tag{10.43a}$$

$$= \sum_{k=1}^{n} x_k\boldsymbol{\varphi}_k \tag{10.43b}$$

This vector space is characterized by the definitions of the inner product between two vectors

$$\langle \boldsymbol{x}, \boldsymbol{y} \rangle = \sum_{k=1}^{n} x_k y_k \tag{10.44}$$

and the vector norm

$$\|\boldsymbol{x}\|^2 = \langle \boldsymbol{x}, \boldsymbol{x} \rangle = \sum_{k=1}^{n} x_k^2 \tag{10.45}$$

The norm $\|\boldsymbol{x}\|$ is the length of a vector. Vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ are said to be orthogonal if their inner product

$$\langle \boldsymbol{x}, \boldsymbol{y} \rangle = 0 \tag{10.46}$$

A set of $n$-dimensional vectors is said to be independent if none of the vectors in the set can be represented as a linear combination of the remaining vectors in that set. Thus, if $\boldsymbol{y}_1, \boldsymbol{y}_2, \ldots, \boldsymbol{y}_m$ is an independent set, then the equality

$$a_1\boldsymbol{y}_1 + a_2\boldsymbol{y}_2 + \cdots + a_m\boldsymbol{y}_m = 0 \tag{10.47}$$

would require that $a_i = 0$, $i = 1, \ldots, m$. A subset of vectors in a given $n$-dimensional space can have dimensionality less than $n$. For example, in a three-dimensional space, all vectors lying in one plane can be specified by two dimensions, and all vectors lying along a line can be specified by one dimension.

An $n$-dimensional space can have at most $n$ independent vectors. If a space has a maximum of $n$ independent vectors, then every vector $\boldsymbol{x}$ in this space can be expressed as a linear combination of these $n$ independent vectors. Thus, any vector in this space can be specified by $n$-tuples. For this reason, a set of $n$ independent vectors in an $n$-dimensional space can be viewed as its basis vectors.

The members of a set of basis vectors form coordinate axes, and they are not unique. The $n$ unit vectors in Eq. (10.42) are independent and can serve as basis vectors. These vectors have an additional property in that they are (mutually) orthogonal and have normalized length, that is,

$$\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle = \begin{cases} 0 & j \ne k \\ 1 & j = k \end{cases} \tag{10.48}$$

Such a set is an orthonormal set of vectors. They capture an orthogonal vector space. Any vector $\boldsymbol{x} = (x_1, x_2, \ldots, x_n)$ can be represented as

$$\boldsymbol{x} = x_1\boldsymbol{\varphi}_1 + x_2\boldsymbol{\varphi}_2 + \cdots + x_n\boldsymbol{\varphi}_n$$

where $x_k$ is the projection of $\boldsymbol{x}$ on the basis vector $\boldsymbol{\varphi}_k$ and is the $k$th coordinate. By using Eq. (10.48), the $k$th coordinate can be obtained from

$$\langle \boldsymbol{x}, \boldsymbol{\varphi}_k \rangle = x_k \qquad k = 1, 2, \ldots, n \tag{10.49}$$

Since any vector in the $n$-dimensional space can be represented by this set of $n$ basis vectors, this set forms a complete orthonormal (CON) set.
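Equation (10.49) holds for any orthonormal basis, not only the unit vectors of Eq. (10.42). As a small illustration, the sketch below uses a hypothetical basis obtained by rotating the two-dimensional unit vectors by 30 degrees, computes the coordinates by inner products, and rebuilds the vector from them; the vector and the rotation angle are arbitrary choices.

```python
import math


def inner(x, y):
    """Inner product of Eq. (10.44)."""
    return sum(a * b for a, b in zip(x, y))


theta = math.pi / 6.0                                   # hypothetical rotation angle
phi1 = (math.cos(theta), math.sin(theta))               # rotated unit vectors form
phi2 = (-math.sin(theta), math.cos(theta))              # another orthonormal basis

x = (2.0, -1.0)                                         # vector to be represented
coords = [inner(x, phi) for phi in (phi1, phi2)]        # Eq. (10.49): x_k = <x, phi_k>
rebuilt = [coords[0] * phi1[i] + coords[1] * phi2[i] for i in range(2)]

print(coords, rebuilt)
print(inner(x, x), inner(coords, coords))               # squared length in both bases
```

The coordinates change with the basis, but the reconstructed vector and its squared length do not, which is why the basis vectors are described as not unique.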

10.4.2 Signal Space and Basis Signals 

The concepts of vector space and basis vectors can be generalized to characterize continuous time signals defined over a time interval $\Theta$. As described in Sec. 2.6, a set of orthonormal signals $\{\varphi_i(t)\}$ can be defined for $t \in \Theta$ if

$$\int_{t \in \Theta} \varphi_j(t)\,\varphi_k(t)\,dt = \begin{cases} 0 & j \ne k \\ 1 & j = k \end{cases} \tag{10.50}$$

If $\{\varphi_i(t)\}$ form a complete set of orthonormal basis functions of a signal space defined over $\Theta$, then every signal $x(t)$ in this signal space can be expressed as

$$x(t) = \sum_k x_k\,\varphi_k(t) \qquad t \in \Theta \tag{10.51}$$

where the signal component in the direction of $\varphi_k(t)$ is*

$$x_k = \int_{t \in \Theta} x(t)\,\varphi_k(t)\,dt \tag{10.52}$$

* If $\{\varphi_k(t)\}$ is complex, orthogonality implies

$$\int_{t \in \Theta} \varphi_j(t)\,\varphi_k^*(t)\,dt = \begin{cases} 0 & j \ne k \\ 1 & j = k \end{cases}$$

and Eq. (10.52) becomes

$$x_k = \int_{t \in \Theta} x(t)\,\varphi_k^*(t)\,dt$$

One such example is for $\Theta = (-\infty, \infty)$. Based on the sampling theorem, all low-pass signals with bandwidth $B$ Hz can be represented by

$$x(t) = \sum_k x_k\,\underbrace{\sqrt{2B}\,\mathrm{sinc}(2\pi Bt - k\pi)}_{\varphi_k(t)} \tag{10.53a}$$

with

$$x_k = \int_{-\infty}^{\infty} x(t)\,\sqrt{2B}\,\mathrm{sinc}(2\pi Bt - k\pi)\,dt = \frac{1}{\sqrt{2B}}\,x\!\left(\frac{k}{2B}\right) \tag{10.53b}$$

Just as there are an infinite number of possible sets of basis vectors for a vector space, there are an infinite number of possible sets of basis signals for a given signal space. For a band-limited signal space, $\{\sqrt{2B}\,\mathrm{sinc}(2\pi Bt - k\pi)\}$ is one possible set of basis signals.

Note that $x(k/2B)$ are the Nyquist rate samples of the original band-limited signal. Since a band-limited signal cannot be time-limited, the total number of Nyquist samples needed will be infinite. Samples at large $k$, however, can be ignored, because their contribution is negligible. A rigorous development of this result, as well as an estimation of the error in ignoring higher dimensions, can be found in Landau and Pollak.3


Scalar Product and Signal Energy 

In a certain signal space, let $x(t)$ and $y(t)$ be two signals. If $\{\varphi_k(t)\}$ are the orthonormal basis signals, then

$$x(t) = \sum_i x_i\,\varphi_i(t) \qquad\qquad y(t) = \sum_j y_j\,\varphi_j(t)$$

Hence,

$$\langle x(t), y(t) \rangle = \int_{t \in \Theta} x(t)\,y(t)\,dt = \int_{t \in \Theta} \left[\sum_i x_i\,\varphi_i(t)\right]\left[\sum_j y_j\,\varphi_j(t)\right] dt$$

Because the basis signals are orthonormal, we have

$$\int_{t \in \Theta} x(t)\,y(t)\,dt = \sum_k x_k y_k \tag{10.54a}$$

The right-hand side of Eq. (10.54a), however, is by definition the inner product of vectors $\boldsymbol{x}$ and $\boldsymbol{y}$. Therefore, we again arrive at Parseval's theorem,

$$\langle x(t), y(t) \rangle = \int_{t \in \Theta} x(t)\,y(t)\,dt = \sum_k x_k y_k = \langle \boldsymbol{x}, \boldsymbol{y} \rangle \tag{10.54b}$$

The signal energy for a signal $x(t)$ is a special case. The energy $E_x$ is given by

$$E_x = \int_{t \in \Theta} x^2(t)\,dt = \langle \boldsymbol{x}, \boldsymbol{x} \rangle = \|\boldsymbol{x}\|^2 \tag{10.55}$$

Hence, the signal energy is equal to the square of the length of the corresponding vector.

Example 10.1 A signal space consists of four signals $s_1(t)$, $s_2(t)$, $s_3(t)$, and $s_4(t)$, as shown in Fig. 10.12. Determine a suitable set of basis vectors and the dimensionality of the signals. Represent these signals geometrically in the vector space.

Figure 10.12 Signals and their representation in signal space.

The two rectangular pulses $\varphi_1(t)$ and $\varphi_2(t)$ in Fig. 10.12b are suitable as a basis signal set. In terms of this set, the vectors $\boldsymbol{s}_1$, $\boldsymbol{s}_2$, $\boldsymbol{s}_3$, and $\boldsymbol{s}_4$ corresponding to signals $s_1(t)$, $s_2(t)$, $s_3(t)$, and $s_4(t)$ are $\boldsymbol{s}_1 = (1, -0.5)$, $\boldsymbol{s}_2 = (-0.5, 1)$, $\boldsymbol{s}_3 = (0, -1)$, and $\boldsymbol{s}_4 = (0.5, 1)$. These points are plotted in Fig. 10.12c. Observe that the inner product between $\boldsymbol{s}_1$ and $\boldsymbol{s}_4$ is

$$\langle \boldsymbol{s}_1, \boldsymbol{s}_4 \rangle = 0.5 - 0.5 = 0$$

Hence, $\boldsymbol{s}_1$ and $\boldsymbol{s}_4$ are orthogonal. This result may be verified from the fact that

$$\int_{-\infty}^{\infty} s_1(t)\,s_4(t)\,dt = 0$$

Note that each point in the signal space in Fig. 10.12c corresponds to some waveform.
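The agreement between the vector inner product and the waveform integral in this example, and between signal energy and squared vector length in Eq. (10.55), can be checked on sampled waveforms. The exact pulses of Fig. 10.12b are not reproduced here, so the sketch assumes two hypothetical unit-energy rectangular pulses occupying $[0, 1)$ and $[1, 2)$; only the coordinate vectors come from the example.

```python
def sample_basis(n_per_pulse):
    """Two hypothetical unit-energy rectangular pulses on [0, 1) and [1, 2)."""
    phi1 = [1.0] * n_per_pulse + [0.0] * n_per_pulse
    phi2 = [0.0] * n_per_pulse + [1.0] * n_per_pulse
    return phi1, phi2


def synthesize(coords, basis):
    """Waveform samples of sum_k coords[k] * basis[k](t)."""
    return [sum(c * phi[k] for c, phi in zip(coords, basis))
            for k in range(len(basis[0]))]


def integrate_product(x, y, dt):
    """Riemann-sum approximation of the integral of x(t) y(t)."""
    return sum(a * b for a, b in zip(x, y)) * dt


n_per_pulse = 100
dt = 1.0 / n_per_pulse
basis = sample_basis(n_per_pulse)
s1, s4 = (1.0, -0.5), (0.5, 1.0)                 # coordinates from Example 10.1
s1_t, s4_t = synthesize(s1, basis), synthesize(s4, basis)

vector_inner = sum(a * b for a, b in zip(s1, s4))
waveform_inner = integrate_product(s1_t, s4_t, dt)
energy_s1 = integrate_product(s1_t, s1_t, dt)
norm_sq_s1 = sum(a * a for a in s1)
print(vector_inner, waveform_inner, energy_s1, norm_sq_s1)
```

Any other orthonormal pair of pulses would give the same inner products and energies, since those depend only on the coordinates.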


Determining an Orthonormal Basis Set 

If there are a finite number of signals $x_i(t)$ in a given signal set of interest, then the orthonormal signal basis can be selected either heuristically or systematically. A heuristic approach requires a good understanding of the relationships among the different signals as well as a certain amount of luck. On the other hand, Gram-Schmidt orthogonalization is a systematic approach to extract the basis signals from the known signal set. The details of this approach are given in Appendix C.
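Appendix C is not reproduced in this section, so the following is only a sketch of the standard Gram-Schmidt procedure applied to sampled waveforms, not necessarily the appendix's exact formulation. The signal set is built from the coordinates of Example 10.1 with two hypothetical non-overlapping rectangular building blocks; the tolerance and function names are illustrative.

```python
import math


def inner(x, y, dt):
    """Riemann-sum approximation of the integral of x(t) y(t)."""
    return sum(a * b for a, b in zip(x, y)) * dt


def gram_schmidt(signals, dt, tol=1e-9):
    """Return orthonormal basis signals spanning the given sampled signals."""
    basis = []
    for s in signals:
        residual = list(s)
        for phi in basis:
            c = inner(residual, phi, dt)                 # component along phi
            residual = [r - c * v for r, v in zip(residual, phi)]
        energy = inner(residual, residual, dt)
        if energy > tol:                                 # skip dependent signals
            norm = math.sqrt(energy)
            basis.append([r / norm for r in residual])
    return basis


n, dt = 200, 0.01
first = [1.0 if k < n // 2 else 0.0 for k in range(n)]   # hypothetical building blocks
second = [0.0 if k < n // 2 else 1.0 for k in range(n)]
coords = [(1.0, -0.5), (-0.5, 1.0), (0.0, -1.0), (0.5, 1.0)]
signals = [[a * u + b * v for u, v in zip(first, second)] for a, b in coords]

basis = gram_schmidt(signals, dt)
print(len(basis))
```

The number of basis signals returned is the dimensionality of the signal set; the basis itself depends on the order in which the signals are processed.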

10.5 VECTOR DECOMPOSITION OF WHITE NOISE 
RANDOM PROCESSES 

In digital communications, the message signal is always one of the $M$ possible waveforms. It is therefore not difficult to represent all $M$ waveforms via a set of CON basis functions. The real challenge, in fact, lies in the vector decomposition of the random noise $\mathrm{n}(t)$ at the receiver. A deterministic signal can be represented by one vector, a point in a signal space. Is it possible to represent a random process as a vector of random variables? If the answer is positive, then the detection problem can be significantly simplified.

Consider a complete orthonormal (CON) set of basis functions $\{\varphi_k(t)\}$ for a signal space defined over $[0, T_0]$. Then any deterministic signal $s(t)$ in this signal space will satisfy the following condition:

$$\int_0^{T_0} \left|s(t) - \sum_k s_k\,\varphi_k(t)\right|^2 dt = 0 \tag{10.56a}$$

This implies that for $t \in [0, T_0]$, we have the equality*

$$s(t) = \sum_k s_k\,\varphi_k(t)$$

However, for random processes defined over $[0, T_0]$, this statement is generally not true. Certain modifications are necessary.

10.5.1 Determining Basis Functions for a Random Process 


First of all, a general random process $x(t)$ cannot strictly satisfy Eq. (10.56a). Instead, a proper
convergence requirement is in the mean square sense, that is,

$$E\left\{ \int_0^{T_0} \Big| x(t) - \sum_k x_k \varphi_k(t) \Big|^2 dt \right\} = 0 \tag{10.56b}$$

This equality can be denoted as

$$x(t) \overset{\text{m.s.}}{=} \sum_k x_k \varphi_k(t) \tag{10.56c}$$

If $x(t)$ and $y(t)$ are equal in the mean square sense, then physically the difference between
these two random processes has zero energy. As far as we are concerned in communications,
signals (or signal differences) with zero energy have no physical effect and can be viewed as 0.


* Strictly speaking, this equality need not hold at every point of the interval $[0, T_0]$. The set of
points at which it fails has measure zero.

For a set of deterministic signals, the basis signals can be derived via the Gram-Schmidt
orthogonalization procedure. However, Gram-Schmidt is invalid for random processes.
Indeed, a random process $x(t)$ is an ensemble of signals. Thus, the basis signals must
also depend on the characteristics of the random process.

The full and rigorous description of the decomposition of a random process can be found
in some classic references.$^4$ Here, it suffices to state that the orthonormal basis functions must
be solutions of the following integral equation:

$$\lambda_i \varphi_i(t) = \int_0^{T_0} R_x(t, t_1)\, \varphi_i(t_1)\, dt_1 \qquad 0 \le t \le T_0 \tag{10.57}$$

The solution of Eq. (10.57) is known as the Karhunen-Loeve expansion. The autocorrelation
function $R_x(t, t_1)$ is known as its kernel function. Indeed, Eq. (10.57) is reminiscent of the
linear algebra equation with respect to eigenvalue $\lambda$ and eigenvector $\boldsymbol{\phi}$:

$$\lambda \boldsymbol{\phi} = R_x \boldsymbol{\phi}$$

in which $\boldsymbol{\phi}$ is a column vector and $R_x$ is a positive semidefinite matrix; $\lambda_i$ are known as the
eigenvalues, whereas the basis functions $\varphi_i(t)$ are the corresponding eigenfunctions.

The Karhunen-Loeve expansion clearly establishes that the basis functions of a random
process $x(t)$ depend on its autocorrelation function $R_x(t, t_1)$. We cannot arbitrarily select a
CON function set. In fact, solving the Karhunen-Loeve expansion can be a nontrivial task.

10.5.2 Geometrical Representation of 
White Noise Processes 

For a stationary white noise process $x(t)$, the autocorrelation function is luckily

$$R_x(t, t_1) = \frac{\mathcal{N}}{2}\, \delta(t - t_1)$$

For this special kernel, the integral equation Eq. (10.57) is reduced to a simple form of

$$\lambda_i \varphi_i(t) = \int_0^{T_0} \frac{\mathcal{N}}{2}\, \delta(t - t_1)\, \varphi_i(t_1)\, dt_1 = \frac{\mathcal{N}}{2}\, \varphi_i(t) \qquad t \in (0, T_0) \tag{10.58}$$

This result implies that any CON set of basis functions can be used to represent stationary
white noise processes. Additionally, the eigenvalues are identically $\lambda_i = \mathcal{N}/2$.

This particular result is of utmost importance to us. In most digital communication applications,
we focus on the optimum receiver design and performance analysis under white noise
channels. In the case of M-ary transmissions, we have an orthonormal set of basis functions
$\{\varphi_k(t)\}$ to represent the M waveforms $\{s_i(t)\}$, such that

$$s_i(t) = \sum_k s_{i,k}\, \varphi_k(t) \qquad i = 1, \ldots, M \tag{10.59a}$$

Based on Eq. (10.58), these basis functions are also suitable for the representation of the white
channel noise $n_w(t)$ such that

$$n_w(t) \overset{\text{m.s.}}{=} \sum_k n_k\, \varphi_k(t) \qquad 0 \le t \le T_0 \tag{10.59b}$$

Consequently, when the transmitter sends $s_i(t)$, the received signal can be decomposed into

$$y(t) = s_i(t) + n_w(t) \overset{\text{m.s.}}{=} \sum_k s_{i,k}\, \varphi_k(t) + \sum_k n_k\, \varphi_k(t) \overset{\text{m.s.}}{=} \sum_k y_k\, \varphi_k(t) \tag{10.59c}$$

by defining

$$y_k = \int_0^{T_0} y(t)\, \varphi_k(t)\, dt = s_{i,k} + n_k \qquad \text{if } s_i(t) \text{ is sent} \tag{10.59d}$$

As a result, when the channel noise is white, the received channel output signal can be
effectively represented by a sequence of random variables $\{y_k\}$ of Eq. (10.59d). In other words,
the optimum receiver for white noise channels can be derived from the information contained in
$(y_1, y_2, \ldots, y_k, \ldots)$.

We note that white noise $x(t)$ consists of an ensemble of sample functions. The coefficients

$$x_k = \int_0^{T_0} x(t)\, \varphi_k(t)\, dt \qquad k = 1, 2, \ldots$$

in the decomposition of Eq. (10.59b) will be different for each sample function. Consequently,
the coefficients are RVs. Each sample function will have a specific vector $(x_1, x_2, \ldots, x_n)$ and
will map into one point in the signal space. This means that the ensemble of sample functions
for the random process $x(t)$ will map into an ensemble of points in the signal space, as shown
in Fig. 10.13. Although this figure shows only a three-dimensional graph (because it is not
possible to show a higher dimensional one), it is sufficient to indicate the idea.

For each trial of the experiment, the outcome (the sample function) is a certain point $\boldsymbol{x}$.
The ensemble of points in the signal space appears as a dust ball, with the density of points
directly proportional to the probability of observing $\mathbf{x}$ in that region. If we denote the joint
PDF of $x_1, x_2, \ldots, x_n$ by $p_{\mathbf{x}}(\boldsymbol{x})$, then

$$p_{\mathbf{x}}(\boldsymbol{x}) = p_{x_1 x_2 \cdots x_n}(x_1, x_2, \ldots, x_n) \tag{10.60}$$

Thus, $p_{\mathbf{x}}(\boldsymbol{x})$ has a certain value at each point in the signal space, and $p_{\mathbf{x}}(\boldsymbol{x})$ represents the
relative probability (dust density) of observing $\mathbf{x} = \boldsymbol{x}$.


10.5.3 White Gaussian Noise 

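If the channel noise $n_w(t)$ is white and Gaussian, then from the discussions in Section 8.6, the
expansion coefficients

$$n_k = \int_0^{T_0} n_w(t)\, \varphi_k(t)\, dt \tag{10.61}$$

are also Gaussian. Indeed, $(n_1, n_2, \ldots, n_k, \ldots)$ are jointly Gaussian.

Here, we shall provide some fundamentals on Gaussian random variables. First, we define
a column vector of n random variables as

$$\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$$

Note that $\mathbf{x}^T$ denotes the transpose of $\mathbf{x}$, and $\bar{\mathbf{x}}$ denotes the mean of $\mathbf{x}$. Random variables (RVs)
$x_1, x_2, \ldots, x_n$ are said to be jointly Gaussian if their joint PDF is given by

$$p_{x_1 x_2 \cdots x_n}(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_x)}} \exp\left[ -\frac{1}{2} (\boldsymbol{x} - \bar{\mathbf{x}})^T K_x^{-1} (\boldsymbol{x} - \bar{\mathbf{x}}) \right] \tag{10.62}$$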


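where $K_x$ is the $n \times n$ covariance matrix

$$K_x = \overline{(\mathbf{x} - \bar{\mathbf{x}})(\mathbf{x} - \bar{\mathbf{x}})^T} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{bmatrix} \tag{10.63a}$$

and the covariance of $x_i$ and $x_j$ is

$$\sigma_{ij} = \overline{(x_i - \bar{x}_i)(x_j - \bar{x}_j)} \tag{10.63b}$$

Here, we use conventional notations $\det(K_x)$ and $K_x^{-1}$ to denote the determinant and the inverse
of matrix $K_x$, respectively.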

Gaussian variables are important not only because they are frequently observed, but also 
because they have certain properties that simplify many mathematical operations that are 
otherwise impossible or very difficult. We summarize these properties as follows: 


P-1: The Gaussian density is completely specified by only the first- and second-order statistics
$\bar{\mathbf{x}}$ and $K_x$. This follows from Eq. (10.62).

P-2: If n jointly Gaussian variables $x_1, x_2, \ldots, x_n$ are uncorrelated, then they are independent.
If the n variables are uncorrelated, $\sigma_{ij} = 0$ $(i \ne j)$, and $K_x$ reduces to a diagonal
matrix. Thus, Eq. (10.62) becomes

$$p_{x_1 x_2 \cdots x_n}(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[ -\frac{(x_i - \bar{x}_i)^2}{2\sigma_i^2} \right] \tag{10.64a}$$

$$= p_{x_1}(x_1)\, p_{x_2}(x_2) \cdots p_{x_n}(x_n) \tag{10.64b}$$

As we observed earlier, independent variables are always uncorrelated, but uncorrelated
variables are not necessarily independent. For the case of jointly Gaussian RVs, however,
uncorrelatedness implies independence.

P-3: When $x_1, x_2, \ldots, x_n$ are jointly Gaussian, all the marginal densities, such as $p_{x_i}(x_i)$, and
all the conditional densities, such as $p_{x_i x_j | x_k x_l \cdots x_p}(x_i, x_j \mid x_k, x_l, \ldots, x_p)$, are Gaussian.
This property can be readily verified (Prob. 8.2-9).

P-4: Linear combinations of jointly Gaussian variables are also jointly Gaussian. Thus, if we
form m variables $y_1, y_2, \ldots, y_m$ $(m \le n)$ obtained from

$$y_i = \sum_{k=1}^{n} a_{ik}\, x_k \tag{10.65}$$

then $y_1, y_2, \ldots, y_m$ are also jointly Gaussian variables.


10.5.4 Properties of Gaussian Random Process 

A random process $x(t)$ is said to be Gaussian if the RVs $x(t_1), x(t_2), \ldots, x(t_n)$ are jointly
Gaussian [Eq. (10.62)] for every n and for every set $(t_1, t_2, \ldots, t_n)$. Hence, the joint PDF of
RVs $x(t_1), x(t_2), \ldots, x(t_n)$ of a Gaussian random process is given by Eq. (10.62) in which the
mean and the covariance matrix $K_x$ are specified by

$$\overline{x(t_i)} \quad \text{and} \quad \sigma_{ij} = R_x(t_i, t_j) - \overline{x(t_i)} \cdot \overline{x(t_j)} \tag{10.66}$$

This shows that a Gaussian random process is completely specified by its autocorrelation
function $R_x(t_i, t_j)$ and its mean value $\overline{x(t)}$.

As discussed in Chapter 9, if the Gaussian random process satisfies two additional
conditions:

$$R_x(t_i, t_j) = R_x(t_i - t_j) \tag{10.67a}$$

and

$$\overline{x(t)} = \text{constant for all } t \tag{10.67b}$$

then it is a wide-sense stationary process. Moreover, Eqs. (10.67) also mean that the joint PDF
of the Gaussian RVs $x(t_1), x(t_2), \ldots, x(t_n)$ is also invariant to a shift of time origin. Hence,
we can conclude that a wide-sense stationary Gaussian random process is also strict-sense
stationary.

Another significant property of the Gaussian process is that the response of a linear system
to a Gaussian process is also a Gaussian process. This arises from property P-4 of the Gaussian
RVs. Let $x(t)$ be a Gaussian process applied to the input of a linear system whose unit impulse
response is $h(t)$. If $y(t)$ is the output (response) process, then

$$y(t) = \int_{-\infty}^{\infty} x(t - \tau)\, h(\tau)\, d\tau = \lim_{\Delta\tau \to 0} \sum_{k=-\infty}^{\infty} x(t - k\Delta\tau)\, h(k\Delta\tau)\, \Delta\tau$$

is a weighted sum of Gaussian RVs. Because $x(t)$ is a Gaussian process, all the variables
$x(t - k\Delta\tau)$ are jointly Gaussian (by definition). Hence, the variables $y(t_1), y(t_2), \ldots, y(t_n)$
for all n and every set $(t_1, t_2, \ldots, t_n)$ are linear combinations of variables that are jointly
Gaussian. Therefore, the variables $y(t_1), y(t_2), \ldots, y(t_n)$ must be jointly Gaussian, according
to the earlier discussion. It follows that the process $y(t)$ is a Gaussian process.

To summarize, the Gaussian random process has the following properties:

1. A Gaussian random process is completely specified by its autocorrelation function and mean
value.

2. If a Gaussian random process is wide-sense stationary, then it is stationary in the strict
sense.

3. The response of a linear system to a Gaussian random process is also a Gaussian random
process.

Consider a white noise process $n_w(t)$ with PSD $\mathcal{N}/2$. Then any complete set of orthonormal
basis signals $\varphi_1(t), \varphi_2(t), \ldots$ can decompose $n_w(t)$ into

$$n_w(t) = n_1 \varphi_1(t) + n_2 \varphi_2(t) + \cdots = \sum_k n_k\, \varphi_k(t)$$

White noise has infinite bandwidth. Consequently, the dimensionality of the signal space is
infinity.

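We shall now show that RVs $n_1, n_2, \ldots$ are independent, with variance $\mathcal{N}/2$ each. First,
we have

$$\overline{n_j n_k} = \overline{\int_0^{T_0} n_w(\alpha)\, \varphi_j(\alpha)\, d\alpha \int_0^{T_0} n_w(\beta)\, \varphi_k(\beta)\, d\beta}$$

$$= \int_0^{T_0}\!\!\int_0^{T_0} \overline{n_w(\alpha)\, n_w(\beta)}\, \varphi_j(\alpha)\, \varphi_k(\beta)\, d\alpha\, d\beta$$

$$= \int_0^{T_0}\!\!\int_0^{T_0} R_{n_w}(\beta - \alpha)\, \varphi_j(\alpha)\, \varphi_k(\beta)\, d\alpha\, d\beta$$

Because $R_{n_w}(\tau) = (\mathcal{N}/2)\, \delta(\tau)$, then

$$\overline{n_j n_k} = \frac{\mathcal{N}}{2} \int_0^{T_0}\!\!\int_0^{T_0} \delta(\beta - \alpha)\, \varphi_j(\alpha)\, \varphi_k(\beta)\, d\alpha\, d\beta = \frac{\mathcal{N}}{2} \int_0^{T_0} \varphi_j(\alpha)\, \varphi_k(\alpha)\, d\alpha = \begin{cases} 0 & j \ne k \\ \dfrac{\mathcal{N}}{2} & j = k \end{cases} \tag{10.68}$$

Hence, $n_j$ and $n_k$ are uncorrelated Gaussian RVs, each with variance $\mathcal{N}/2$. Since they are
Gaussian, uncorrelatedness implies independence. This proves the result.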

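For the time being, assume that we are considering an N-dimensional case. The joint PDF
of independent joint Gaussian RVs $n_1, n_2, \ldots, n_N$, each with zero mean and variance $\mathcal{N}/2$,
is [see Eq. (10.64)]

$$p_{\mathbf{n}}(\boldsymbol{n}) = \prod_{j=1}^{N} \frac{1}{\sqrt{2\pi \mathcal{N}/2}}\, e^{-n_j^2 / 2(\mathcal{N}/2)} \tag{10.69a}$$

$$= \frac{1}{(\pi \mathcal{N})^{N/2}}\, e^{-(n_1^2 + n_2^2 + \cdots + n_N^2)/\mathcal{N}} = \frac{1}{(\pi \mathcal{N})^{N/2}}\, e^{-\|\boldsymbol{n}\|^2 / \mathcal{N}} \tag{10.69b}$$

This shows that the PDF $p_{\mathbf{n}}(\boldsymbol{n})$ depends only on the norm $\|\boldsymbol{n}\|$, which is the sampled length
of the noise vector $\mathbf{n}$ in the hyperspace, and is therefore spherically symmetrical if plotted in
the N-dimensional hyperspace.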
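The result of Eq. (10.68) can be illustrated with a small Monte Carlo experiment. The Python sketch below rests on several assumptions that are not part of the text: white noise of PSD $\mathcal{N}/2$ is approximated by independent Gaussian samples of variance $(\mathcal{N}/2)/\Delta t$, the basis consists of three sinusoids that are orthonormal on $[0, T_0]$, and the variable `N0` (standing for $\mathcal{N}$), the sample counts, and the seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

N0 = 2.0             # hypothetical noise level; the PSD is N0 / 2
T0, n = 1.0, 200     # observation interval and samples per interval
dt = T0 / n
t = (np.arange(n) + 0.5) * dt

# Three orthonormal basis signals on [0, T0] (an arbitrary CON subset).
phi = np.array([np.sqrt(2.0 / T0) * np.sin(2 * np.pi * k * t / T0)
                for k in (1, 2, 3)])

# Discrete-time stand-in for white noise: variance (N0/2)/dt per sample.
trials = 20000
noise = rng.normal(0.0, np.sqrt((N0 / 2) / dt), size=(trials, n))

# n_k = integral of n_w(t) phi_k(t) dt, approximated by a Riemann sum.
coeffs = noise @ phi.T * dt

# Sample estimate of the correlation matrix of (n_1, n_2, n_3).
cov = coeffs.T @ coeffs / trials
print(np.round(cov, 3))
```

Under these assumptions the estimated matrix should be close to $(\mathcal{N}/2)$ times the identity, that is, the coefficients are uncorrelated with equal variance. Any other orthonormal set could be substituted for the sinusoids.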


10.6 OPTIMUM RECEIVER FOR WHITE GAUSSIAN 
NOISE CHANNELS 

10.6.1 Geometric Representations 

We shall now consider, from a more fundamental point of view, the problem of M-ary communication
in the presence of additive white Gaussian noise (AWGN). Such a channel is known
as the AWGN channel. Unlike the linear receivers previously studied in Secs. 10.1 to 10.3,
no constraint is placed on the optimum structure. We shall answer the fundamental question:
What receiver will yield the minimum error probability?

The comprehension of the signal detection problem is greatly facilitated by geometrical
representation of signals. In a signal space, we can represent a signal by a fixed point (or a
vector). A random process can be represented by a random point (or a random vector). The
region in which the random point may lie will be shown shaded, with the shading intensity
proportional to the probability of observing the signal in that region. In the M-ary scheme,
we use M symbols, or messages, $m_1, m_2, \ldots, m_M$. Each of these symbols is represented by
a specified waveform. Let the corresponding waveforms be $s_1(t), s_2(t), \ldots, s_M(t)$. Thus, the
symbol (or message) $m_k$ is sent by transmitting the waveform $s_k(t)$. These waveforms are
corrupted by AWGN $n_w(t)$ (Fig. 10.14) with PSD

$$S_{n_w}(\omega) = \frac{\mathcal{N}}{2}$$

Figure 10.14 M-ary communication system.

At the receiver, the received signal $r(t)$ consists of one of the M message waveforms $s_k(t)$
plus the channel noise,

$$r(t) = s_k(t) + n_w(t) \tag{10.70a}$$

Because the noise $n_w(t)$ is white, we can use the same basis functions to decompose both
$s_k(t)$ and $n_w(t)$. Thus, we can represent $r(t)$ in a signal space by denoting $\mathbf{r}$, $\mathbf{s}_k$, and $\mathbf{n}_w$ as the
vectors representing signals $r(t)$, $s_k(t)$, and $n_w(t)$, respectively. Then it is evident that

$$\mathbf{r} = \mathbf{s}_k + \mathbf{n}_w \tag{10.70b}$$

The signal vector $\mathbf{s}_k$ is a fixed vector, because the waveform $s_k(t)$ is nonrandom, whereas the
noise vector $\mathbf{n}_w$ is random. Hence, the vector $\mathbf{r}$ is also random. Because $n_w(t)$ is a Gaussian
white noise, the probability distribution of $\mathbf{n}_w$ has spherical symmetry in the signal space (as
shown in the last section). Hence, the distribution of $\mathbf{r}$ is a spherical distribution centered at a
fixed point $\mathbf{s}_k$, as shown in Fig. 10.15. Whenever the message $m_k$ is transmitted, the probability
of observing the received signal $r(t)$ in a given scatter region is indicated by the intensity of the
shading in Fig. 10.15. Actually, because the noise is white, the space has an infinite number of
dimensions. For simplicity, however, we have shown the space to be three-dimensional. This
will suffice to indicate our line of reasoning. We can draw similar scatter regions for various
points $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_M$. Figure 10.16a shows the scatter regions for two messages $m_j$ and $m_k$
when $\mathbf{s}_j$ and $\mathbf{s}_k$ are widely separated in signal space. In this case, there is virtually no overlap
between the two scattered regions. If either $m_j$ or $m_k$ is transmitted, the received signal will lie
in one of the two scatter regions. From the position of the received signal, one can decide with
a very small probability of error whether $m_j$ or $m_k$ was transmitted. In Fig. 10.16a, the received
signal $\mathbf{r}$ is much closer to $\mathbf{s}_k$ than to $\mathbf{s}_j$. It is therefore more likely that $m_k$ was transmitted.
Note that theoretically each scatter extends to infinity, although the probability of observing
the received signal diminishes rapidly as a point is scattered away from the center. Hence,
there will always be some overlap between the two scatter sets, resulting in a nonzero error
probability. Thus, even though the received $\mathbf{r}$ is much closer to $\mathbf{s}_k$ in Fig. 10.16a, it may still
be generated by $\mathbf{s}_j$ plus channel noise.

Figure 10.16 Binary communication in the presence of noise.

Figure 10.16b illustrates the case of stronger noise. In this case, there is a considerable
overlap between the two scattered regions. Because the received signal $\mathbf{r}$ is closer to $\mathbf{s}_j$ than
to $\mathbf{s}_k$, it is more likely that $m_j$ was transmitted. But in this case there is also a considerable
probability that $m_k$ may have been transmitted. Hence in this situation, there will be a much
higher probability of error in any decision scheme.

The optimum receiver must decide, from a knowledge of $\mathbf{r}$, which message has been
transmitted. The signal space must be divided into M nonoverlapping, or disjoint, decision
regions $R_1, R_2, \ldots, R_M$, corresponding to the M messages $m_1, m_2, \ldots, m_M$. If $\mathbf{r}$ falls in the
region $R_k$, the decision is $m_k$. The problem of designing the receiver then reduces to choosing
the boundaries of these decision regions $R_1, R_2, \ldots, R_M$ to minimize the probability of error
in decision making.

To recapitulate: A transmitter sends a sequence of messages from a set of M
messages $m_1, m_2, \ldots, m_M$. These messages are represented by finite energy waveforms
$s_1(t), s_2(t), \ldots, s_M(t)$. One waveform is transmitted every $T_0 = T_M$ seconds. We assume that
the receiver is time-synchronized with the transmitter. The waveforms are corrupted during
transmission by an AWGN of PSD $\mathcal{N}/2$. Knowing the received waveform, the receiver must
decide which waveform was transmitted. The merit criterion of the receiver is the minimum
probability of error in making this decision.


10.6.2 Dimensionality of the Detection Signal Space 

Let us now discuss the dimensionality of the signal space in our detection problem. If there was
no noise, we would be dealing with only M waveforms $s_1(t), s_2(t), \ldots, s_M(t)$. In this case
a signal space of, at most, M dimensions would suffice. This is because the dimensionality
of a signal space is always equal to or less than the number of independent signals in the
space (Sec. 10.4). For the sake of generality, we shall assume the space to have N dimensions
$(N \le M)$. Let $\varphi_1(t), \varphi_2(t), \ldots, \varphi_N(t)$ be the orthonormal basis set for this space. Such a set
can be constructed by using the Gram-Schmidt procedure discussed in Appendix C. We can
then represent the signal waveform $s_j(t)$ as

$$s_j(t) = s_{j,1}\, \varphi_1(t) + s_{j,2}\, \varphi_2(t) + \cdots + s_{j,N}\, \varphi_N(t) \tag{10.71a}$$

$$= \sum_{k=1}^{N} s_{j,k}\, \varphi_k(t) \qquad j = 1, 2, \ldots, M \tag{10.71b}$$

where

$$s_{j,k} = \int_{T_M} s_j(t)\, \varphi_k(t)\, dt \tag{10.71c}$$

Now consider the white Gaussian channel noise $n_w(t)$. This signal has an infinite bandwidth
$(B = \infty)$. It has an infinite number of dimensions and obviously cannot be fully represented
in the finite N-dimensional signal space discussed earlier. We can, however, split $n_w(t)$ into
two components: (1) the portion of $n_w(t)$ inside the N-dimensional signal space, and (2) the
remaining component orthogonal to the N-dimensional signal space. Let us denote the two
components by $n(t)$ and $n_0(t)$. Thus,

$$n_w(t) = n(t) + n_0(t) \tag{10.72}$$

where

$$n(t) = \sum_{j=1}^{N} n_j\, \varphi_j(t) \tag{10.73a}$$

and

$$n_0(t) = \sum_{j=N+1}^{\infty} n_j\, \varphi_j(t) \tag{10.73b}$$

where

$$n_j = \int_{T_M} n(t)\, \varphi_j(t)\, dt \tag{10.73c}$$

Because $n_0(t)$ is orthogonal to the N-dimensional space, it is orthogonal to every signal in that
space. Hence,

$$\int_{T_M} n_0(t)\, \varphi_j(t)\, dt = 0 \qquad j = 1, 2, \ldots, N$$

Therefore,

$$n_j = \int_{T_M} [n(t) + n_0(t)]\, \varphi_j(t)\, dt = \int_{T_M} n_w(t)\, \varphi_j(t)\, dt \qquad j = 1, 2, \ldots, N \tag{10.74}$$

From Eqs. (10.73a) and (10.74), it is evident that we can filter out the component $n_0(t)$ from
$n_w(t)$. This can be seen from the fact that the received signal, $r(t)$, can be expressed as

$$r(t) = s_k(t) + n_w(t) = s_k(t) + n(t) + n_0(t) = q(t) + n_0(t) \tag{10.75}$$

where $q(t)$ is the projection of $r(t)$ on the N-dimensional space:

$$q(t) = s_k(t) + n(t) \tag{10.76}$$

We can obtain the projection $q(t)$ from $r(t)$ by observing that [see Eqs. (10.71b) and (10.73a)]

$$q(t) = \sum_{j=1}^{N} (s_{k,j} + n_j)\, \varphi_j(t) \tag{10.77}$$

From Eqs. (10.71c), (10.74), and (10.77) it follows that if we feed the received signal $r(t)$
into the system shown in Fig. 10.17, the resultant outcome will be $q(t)$. Thus, the orthogonal
noise component can be filtered out without disturbing the message signal.
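The projection of Eqs. (10.75) to (10.77) can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the signal space is taken to be two-dimensional with a sine and cosine basis, white noise is approximated by independent Gaussian samples of variance $1/\Delta t$ (PSD equal to 1), and the function name `project` and all numerical values are hypothetical.

```python
import numpy as np

def project(r, basis, dt):
    """Return the coefficients r_j and the projection q(t) of r(t) on the basis."""
    coeffs = basis @ r * dt    # r_j = integral of r(t) phi_j(t) dt
    q = coeffs @ basis         # q(t) = sum over j of r_j phi_j(t)
    return coeffs, q

n = 400
dt = 1.0 / n
t = (np.arange(n) + 0.5) * dt

# Assumed two-dimensional signal space (N = 2) on a unit interval.
basis = np.array([np.sqrt(2.0) * np.sin(2 * np.pi * t),
                  np.sqrt(2.0) * np.cos(2 * np.pi * t)])

s_coords = np.array([1.0, -0.5])   # hypothetical signal vector s_k
s_k = s_coords @ basis

rng = np.random.default_rng(1)
n_w = rng.normal(0.0, np.sqrt(1.0 / dt), size=n)   # white-noise stand-in

r = s_k + n_w
r_coeffs, q = project(r, basis, dt)
n_0 = r - q                        # discarded component, outside the signal space

# The discarded part has no component along either basis signal, and
# projecting the clean signal returns its coordinates unchanged.
residual_coeffs = basis @ n_0 * dt
clean_coeffs, _ = project(s_k, basis, dt)
print(np.round(residual_coeffs, 12), np.round(clean_coeffs, 6))
```

In this sketch the coefficients of the discarded component vanish (up to rounding), while the signal coordinates survive the projection intact, which mirrors the statement that the orthogonal noise is removed without disturbing the message signal.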

The question here is: Would such filtering help in our decision making? We can easily show
that it cannot hurt us. The noise $n_w(t)$ is independent of the signal waveform $s_k(t)$. Therefore,
its component $n_0(t)$ is also independent of $s_k(t)$. Thus, $n_0(t)$ contains no information about the
transmitted signal, and discarding such a component from the received signal $r(t)$ will not cause
any loss of information regarding the signal waveform $s_k(t)$. This, however, is not enough.
We must also make sure that the noise being discarded [$n_0(t)$] is not in any way related to the
remaining noise component $n(t)$. If $n_0(t)$ and $n(t)$ are related in any way, it will be possible to
obtain some information about $n(t)$ from $n_0(t)$, thereby enabling us to detect that signal with
less error probability. If the components $n_0(t)$ and $n(t)$ are independent random processes, the
component $n_0(t)$ does not carry any information about $n(t)$ and can be discarded. Under these
conditions, $n_0(t)$ is irrelevant to the decision making at the receiver.

The process $n(t)$ is represented by components $n_1, n_2, \ldots, n_N$ along $\varphi_1(t), \varphi_2(t), \ldots,
\varphi_N(t)$, and $n_0(t)$ is represented by the remaining components (infinite number) along the
remaining basis signals in the complete set $\{\varphi_k(t)\}$. Because the channel noise is white
Gaussian, from Eq. (10.68) we observe that all the components are independent. Hence,
the components representing $n_0(t)$ are independent of the components representing $n(t)$.
Consequently, $n_0(t)$ is independent of $n(t)$ and contains only irrelevant data.

Figure 10.17 Eliminating the noise orthogonal to signal space.

The received signal $r(t)$ is now reduced to the signal $q(t)$, which contains the desired signal
waveform and the projection of the channel noise on the N-dimensional signal space. Thus,
the signal $q(t)$ can be completely represented in the signal space. Let the vectors representing
$n(t)$ and $q(t)$ be denoted by $\mathbf{n}$ and $\mathbf{q}$. Thus,

$$\mathbf{q} = \mathbf{s} + \mathbf{n}$$

where $\mathbf{s}$ may be any one of vectors $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_M$.

The random vector $\mathbf{n} = (n_1, n_2, \ldots, n_N)$ is represented by N independent Gaussian
variables, each with zero mean and variance $\sigma_n^2 = \mathcal{N}/2$. The joint PDF of vector $\mathbf{n}$ in such a
case has a spherical symmetry, as shown in Eq. (10.69b),

$$p_{\mathbf{n}}(\boldsymbol{n}) = \frac{1}{(\pi \mathcal{N})^{N/2}}\, e^{-\|\boldsymbol{n}\|^2 / \mathcal{N}} \tag{10.78a}$$

Note that this is actually a compact notation for

$$p_{n_1 n_2 \cdots n_N}(n_1, n_2, \ldots, n_N) = \frac{1}{(\pi \mathcal{N})^{N/2}}\, e^{-(n_1^2 + n_2^2 + \cdots + n_N^2)/\mathcal{N}} \tag{10.78b}$$


10.6.3 (Simplified) Signal Space and Decision Procedure 


Our problem is now considerably simplified. The irrelevant noise component has been filtered
out. The residual signal $q(t)$ can be represented in an N-dimensional signal space. We proceed
to determine the M decision regions $R_1, R_2, \ldots, R_M$ in this space. The regions must be chosen
to minimize the probability of error in making the decision.

Suppose the received vector $\mathbf{q} = \boldsymbol{q}$. Then if the receiver decides $m = m_k$, the conditional
probability of making the correct decision, given that $\mathbf{q} = \boldsymbol{q}$, is

$$P(C \mid \mathbf{q} = \boldsymbol{q}) = P(m_k \mid \mathbf{q} = \boldsymbol{q}) \tag{10.79}$$

where $P(C \mid \mathbf{q} = \boldsymbol{q})$ is the conditional probability of making the correct decision given $\mathbf{q} = \boldsymbol{q}$,
and $P(m_k \mid \mathbf{q} = \boldsymbol{q})$ is the conditional probability that $m_k$ was transmitted given $\mathbf{q} = \boldsymbol{q}$. The
unconditional probability $P(C)$ is given by

$$P(C) = \int_{\boldsymbol{q}} P(C \mid \mathbf{q} = \boldsymbol{q})\, p_{\mathbf{q}}(\boldsymbol{q})\, d\boldsymbol{q} \tag{10.80}$$

where the integration is performed over the entire region occupied by $\boldsymbol{q}$. Note that this is
an N-fold integration with respect to the variables $q_1, q_2, \ldots, q_N$ over the signal waveform
duration. Also, because $p_{\mathbf{q}}(\boldsymbol{q}) \ge 0$, this integral is maximum when $P(C \mid \mathbf{q} = \boldsymbol{q})$ is maximum.
From Eq. (10.79) it now follows that if a decision $m = m_k$ is made, the error probability is
minimized if the probability

$$P(C) = \int_{\boldsymbol{q}} P(C \mid \mathbf{q} = \boldsymbol{q})\, p_{\mathbf{q}}(\boldsymbol{q})\, d\boldsymbol{q}$$

is maximized. The probability $P(m_k \mid \mathbf{q} = \boldsymbol{q})$ is called the a posteriori probability of $m_k$. This
is because it represents the probability that $m_k$ was transmitted when $\boldsymbol{q}$ was being received.

The decision procedure for maximizing the probability of correct decision $P(C)$, thereby
minimizing the probability of error, is now clear. Once we receive $\mathbf{q} = \boldsymbol{q}$, we evaluate all M
a posteriori probability functions $\{P(m_j \mid \mathbf{q} = \boldsymbol{q})\}$. Then we make the decision in favor of that
message for which the a posteriori probability is highest, that is, the receiver decides that
$m = m_k$ if

$$P(m_k \mid \mathbf{q} = \boldsymbol{q}) > P(m_j \mid \mathbf{q} = \boldsymbol{q}) \qquad \text{for all } j \ne k \tag{10.81}$$

Thus, the detector that minimizes the error probability is the maximum a posteriori
probability (MAP) detector.

We can use Bayes' rule (Chapter 8) to determine the a posteriori probabilities. We have

$$P(m_k \mid \mathbf{q} = \boldsymbol{q}) = \frac{P(m_k)\, p_{\mathbf{q}}(\boldsymbol{q} \mid m_k)}{p_{\mathbf{q}}(\boldsymbol{q})} \tag{10.82}$$

Hence, the receiver decides $m = m_k$ if the decision function

$$\frac{P(m_i)\, p_{\mathbf{q}}(\boldsymbol{q} \mid m_i)}{p_{\mathbf{q}}(\boldsymbol{q})} \qquad i = 1, 2, \ldots, M$$

is maximum for $i = k$.

Note that the denominator $p_{\mathbf{q}}(\boldsymbol{q})$ is common to all decision functions and is not affected
by the decision. Hence, it may be ignored during the decision. Thus, the receiver sets $m = m_k$
if the decision function

$$P(m_i)\, p_{\mathbf{q}}(\boldsymbol{q} \mid m_i) \qquad i = 1, 2, \ldots, M \tag{10.83}$$

is maximum for $i = k$. Thus, once $\boldsymbol{q}$ is obtained, we compute the decision function [Eq. (10.83)]
for all messages $m_1, m_2, \ldots, m_M$ and decide that the message for which the function is
maximum is the one most likely to have been sent.

We now turn our attention to finding the decision functions. The a priori probability $P(m_i)$
represents the probability that the message $m_i$ will be transmitted. These probabilities must be
known if the criterion discussed is to be used.* The term $p_{\mathbf{q}}(\boldsymbol{q} \mid m_i)$ represents the PDF of $\mathbf{q}$
when the transmitter sends $s(t) = s_i(t)$. Under this condition,

$$\mathbf{q} = \mathbf{s}_i + \mathbf{n}$$

and

$$\mathbf{n} = \mathbf{q} - \mathbf{s}_i$$

The point $\mathbf{s}_i$ is constant, and $\mathbf{n}$ is a random point. Obviously, $\mathbf{q}$ is a random point with the same
distribution as $\mathbf{n}$ but centered at the point $\mathbf{s}_i$.

Alternatively, the probability density at $\mathbf{q} = \boldsymbol{q}$ (given $m = m_i$) is the same as the probability
density of $\mathbf{n} = \boldsymbol{q} - \mathbf{s}_i$. Hence [Eq. (10.78a)],

$$p_{\mathbf{q}}(\boldsymbol{q} \mid m_i) = p_{\mathbf{n}}(\boldsymbol{q} - \mathbf{s}_i) = \frac{1}{(\pi \mathcal{N})^{N/2}}\, e^{-\|\boldsymbol{q} - \mathbf{s}_i\|^2 / \mathcal{N}} \tag{10.84}$$

* In case these probabilities are unknown, one must use other merit criteria, such as maximum likelihood or
minimax, as will be discussed later.

The decision function in Eq. (10.83) now becomes

$$\frac{P(m_i)}{(\pi \mathcal{N})^{N/2}}\, e^{-\|\boldsymbol{q} - \mathbf{s}_i\|^2 / \mathcal{N}} \tag{10.85}$$

Note that the decision function is always nonnegative for all values of i. Hence, comparing these
functions is equivalent to comparing their logarithms, because the logarithm is a monotone
function for the positive argument. Hence, for convenience, the decision function will be
chosen as the logarithm of Eq. (10.85). In addition, the factor $(\pi \mathcal{N})^{N/2}$ is common for all i
and can be left out. Hence, the decision function to maximize is


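$$\ln P(m_i) - \frac{1}{\mathcal{N}}\, \|\boldsymbol{q} - \mathbf{s}_i\|^2 \tag{10.86}$$

Note that $\|\boldsymbol{q} - \mathbf{s}_i\|^2$ is the square of the length of the vector $\boldsymbol{q} - \mathbf{s}_i$. Hence,

$$\|\boldsymbol{q} - \mathbf{s}_i\|^2 = \langle \boldsymbol{q} - \mathbf{s}_i,\ \boldsymbol{q} - \mathbf{s}_i \rangle = \|\boldsymbol{q}\|^2 + \|\mathbf{s}_i\|^2 - 2\langle \boldsymbol{q}, \mathbf{s}_i \rangle \tag{10.87}$$

Hence, the decision function in Eq. (10.86) becomes (after multiplying throughout by $\mathcal{N}/2$)

$$\frac{\mathcal{N}}{2} \ln P(m_i) - \frac{1}{2} \left( \|\boldsymbol{q}\|^2 + \|\mathbf{s}_i\|^2 - 2\langle \boldsymbol{q}, \mathbf{s}_i \rangle \right) \tag{10.88}$$

Note that the term $\|\mathbf{s}_i\|^2$ is the square of the length of $\mathbf{s}_i$ and represents $E_i$, the energy of signal
$s_i(t)$. The terms $\mathcal{N} \ln P(m_i)$ and $E_i$ are constants in the decision function. Let

$$a_i = \frac{1}{2} \left[ \mathcal{N} \ln P(m_i) - E_i \right] \tag{10.89}$$

Now the decision function in Eq. (10.88) becomes

$$a_i + \langle \boldsymbol{q}, \mathbf{s}_i \rangle - \frac{\|\boldsymbol{q}\|^2}{2}$$

The term $\|\boldsymbol{q}\|^2 / 2$ is common to all M decision functions and can be omitted for the purpose
of comparison. Thus, the new decision function $b_i$ is

$$b_i = a_i + \langle \boldsymbol{q}, \mathbf{s}_i \rangle \tag{10.90}$$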


We compute this function $b_i$ for $i = 1, 2, \ldots, M$, and the receiver decides that $m = m_k$ if this
function is the largest for $i = k$. If the signal $q(t)$ is applied at the input terminals of a system
whose impulse response is $h(t)$, the output at $t = T_M$ is given by

$$\int_{-\infty}^{\infty} q(\tau)\, h(T_M - \tau)\, d\tau$$

If we choose a filter matched to $s_i(t)$, that is, $h(t) = s_i(T_M - t)$, then

$$h(T_M - \tau) = s_i(\tau)$$

and based on Parseval's theorem, the output is

$$\int_{-\infty}^{\infty} q(\tau)\, s_i(\tau)\, d\tau = \langle \boldsymbol{q}, \mathbf{s}_i \rangle$$

Hence, $\langle \boldsymbol{q}, \mathbf{s}_i \rangle$ is the output at $t = T_M$ of a filter matched to $s_i(t)$ when $q(t)$ is applied to
its input.

Actually, we do not have $q(t)$. The incoming signal $r(t)$ is given by

$$r(t) = s_i(t) + n_w(t) = \underbrace{s_i(t) + n(t)}_{q(t)} + \underbrace{n_0(t)}_{\text{irrelevant}}$$

where $n_0(t)$ is the (irrelevant) component of $n_w(t)$ orthogonal to the N-dimensional signal
space. Because $n_0(t)$ is orthogonal to this space, it is orthogonal to every signal in this space.
Hence, it is orthogonal to the signal $s_i(t)$, and

$$\int_{-\infty}^{\infty} n_0(t)\, s_i(t)\, dt = 0$$

and

$$\langle \boldsymbol{q}, \mathbf{s}_i \rangle = \int_{-\infty}^{\infty} q(t)\, s_i(t)\, dt + \int_{-\infty}^{\infty} n_0(t)\, s_i(t)\, dt = \int_{-\infty}^{\infty} [q(t) + n_0(t)]\, s_i(t)\, dt = \int_{-\infty}^{\infty} r(t)\, s_i(t)\, dt \tag{10.91}$$

Hence, it is immaterial whether we use $q(t)$ or $r(t)$ at the input. We thus apply the incoming
signal $r(t)$ to a parallel bank of matched filters, and the output of the filters is sampled at
$t = T_M$. Then a constant $a_i$ is added to the ith filter output sample, and the resulting outputs
are compared. The decision is made in favor of the signal for which this output is the largest.
The receiver implementation for this decision procedure is shown in Fig. 10.18a. Section 10.1
has already established that a matched filter is equivalent to a correlator. One may therefore
use correlators instead of matched filters. Such an arrangement is shown in Fig. 10.18b.

We have shown that in the presence of AWGN, the matched filter receiver is the optimum
receiver when the merit criterion is minimum error probability. Note that the optimum system
is found to be linear, although it was not constrained to be so. Therefore, for white Gaussian
noise, the optimum receiver happens to be linear. The matched filter obtained in Secs. 10.1
and 10.2, as well as the decision procedure, are identical to those derived here.

The optimum receiver can be implemented in another way. From Eq. (10.91), we have

$$\langle \boldsymbol{q}, \mathbf{s}_i \rangle = \langle \boldsymbol{r}, \mathbf{s}_i \rangle$$

From Eq. (10.44), we can rewrite this as

$$\langle \boldsymbol{q}, \mathbf{s}_i \rangle = \sum_{j=1}^{N} r_j\, s_{ij}$$

Figure 10.18 Optimum M-ary receiver: (a) matched filter detector; (b) correlation detector.

The term $\langle \boldsymbol{q}, \mathbf{s}_i \rangle$ is computed according to this equation by first generating $r_j$ and then computing
the sum of $r_j s_{ij}$ (remember that the $s_{ij}$ are known), as shown in Fig. 10.19a. The M correlator
detectors in Fig. 10.18b can be replaced by N filters matched to $\varphi_1(t), \varphi_2(t), \ldots, \varphi_N(t)$, as
shown in Fig. 10.19b. These types of optimum receiver (Figs. 10.18 and 10.19) perform identically.
The choice will depend on the hardware cost. For example, if $N < M$ and signals $\{\varphi_j(t)\}$
are easier to generate than $\{s_j(t)\}$, then the design of Fig. 10.19 would be chosen.
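The decision rule of Eqs. (10.89) and (10.90) operates on the vector of correlator outputs and can be sketched in a few lines of Python. This is a minimal illustration, not a receiver design: the function name `map_decide`, the parameter `noise_level` (standing for $\mathcal{N}$), and the two example signal sets are assumptions made for the example.

```python
import numpy as np

def map_decide(q, signals, priors, noise_level):
    """Return the index i that maximizes b_i = a_i + <q, s_i>,
    where a_i = (noise_level * ln P(m_i) - E_i) / 2."""
    signals = np.asarray(signals, dtype=float)
    energies = np.sum(signals ** 2, axis=1)             # E_i = ||s_i||^2
    a = 0.5 * (noise_level * np.log(priors) - energies)
    b = a + signals @ q                                 # a_i + <q, s_i>
    return int(np.argmax(b))

# Hypothetical four-signal set in a two-dimensional signal space.
qpsk = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
d_qpsk = map_decide(np.array([0.9, -1.2]), qpsk, [0.25] * 4, 2.0)

# Hypothetical binary set: unequal priors shift the decision toward
# the more likely message even though q lies closer to the other signal.
binary = [[1.0, 0.0], [-1.0, 0.0]]
q = np.array([-0.2, 0.0])
d_equal = map_decide(q, binary, [0.5, 0.5], 2.0)
d_biased = map_decide(q, binary, [0.9, 0.1], 2.0)

print(d_qpsk, d_equal, d_biased)
```

With equal priors the received point is assigned to the nearer signal, whereas with priors of 0.9 and 0.1 the constant $a_i$ tips the decision toward the more probable message. In a receiver of the form of Fig. 10.19, the input `q` would be the vector of sampled outputs $r_j$.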


10.6.4 Decision Regions and Error Probability 

To compute the error probability of the optimum receiver, we must first determine decision
regions in the signal space. As mentioned earlier, the signal space is divided into M
nonoverlapping, or disjoint, decision regions $R_1, R_2, \ldots, R_M$, corresponding to M messages.
If $\boldsymbol{q}$ falls in the region $R_k$, the decision is that $m_k$ was transmitted. The decision regions
are chosen to minimize the probability of error in the receiver. In light of this geometrical
representation, we shall now try to interpret how the optimum receiver sets these decision
regions.

Figure 10.19 Another form of optimum M-ary receiver: (a) correlator; (b) matched filter.

The decision function is given by Eq. (10.86). The optimum receiver sets $m = m_k$ if the
decision function

$$\mathcal{N} \ln P(m_i) - \|\boldsymbol{q} - \mathbf{s}_i\|^2$$

is maximum for $i = k$. This equation defines the decision regions.


Geometric Interpretation in Signal Space 

For simplicity, let us first consider the case of equiprobable messages, that is, $P(m_i) = 1/M$
for all $i$. In this case, the first term in the decision function is the same for all $i$ and, hence,
can be dropped. Thus, the receiver decides that $\hat{m} = m_k$ if the term $-\|\mathbf{q} - \mathbf{s}_i\|^2$ is largest
(numerically the smallest) for $i = k$. Alternatively, this may be stated as follows: the receiver
decides that $\hat{m} = m_k$ if the decision function $\|\mathbf{q} - \mathbf{s}_i\|^2$ is minimum for $i = k$. Note that
$\|\mathbf{q} - \mathbf{s}_i\|$ is the distance of point $\mathbf{q}$ from point $\mathbf{s}_i$. Thus, the decision procedure in this case has a
simple interpretation in geometrical space. The decision is made in favor of that signal which
is closest to $\mathbf{q}$, the projection of $\mathbf{r}$ [the component of $r(t)$] in the signal space.
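The decision rule just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `map_decision`, the argument `noise_n` (standing for $\mathcal{N}$), and the sample signal points are assumptions introduced here, not part of the text. With equal priors the rule reduces to picking the nearest signal point; with unequal priors the term $\mathcal{N} \ln P(m_i)$ biases the choice.

```python
import math

def map_decision(q, signals, priors, noise_n):
    """Return the index k that maximizes N ln P(m_i) - ||q - s_i||^2."""
    def metric(i):
        dist_sq = sum((qc - sc) ** 2 for qc, sc in zip(q, signals[i]))
        return noise_n * math.log(priors[i]) - dist_sq
    return max(range(len(signals)), key=metric)

# Hypothetical two-signal example in a two-dimensional signal space.
signals = [(-1.0, 0.0), (1.0, 0.0)]
q = (0.1, 0.0)  # received point, slightly closer to the second signal

equal = map_decision(q, signals, [0.5, 0.5], noise_n=1.0)   # nearest signal wins
skewed = map_decision(q, signals, [0.9, 0.1], noise_n=1.0)  # prior outweighs distance
print(equal, skewed)
```

In the second call the received point is still nearer to the second signal, but the large prior of the first message moves the decision to it.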

This result is expected on qualitative grounds for Gaussian noise, because the Gaussian
noise has a spherical symmetry. If, however, the messages are not equiprobable, we cannot go
too far on purely qualitative grounds. Nevertheless, we can draw certain broad conclusions.
If a particular message $m_i$ is more likely than the others, one will be safer in deciding more
often in favor of $m_i$ than other messages. Hence, in such a case the decision regions will be
biased, or weighted, in favor of $m_i$. This is shown by the appearance of the term $\ln P(m_i)$ in
the decision function. To better understand this point, let us consider a two-dimensional signal
space and two signals $\mathbf{s}_1$ and $\mathbf{s}_2$, as shown in Fig. 10.20a. In this figure, the decision regions
$R_1$ and $R_2$ are shown for equiprobable messages; $P(m_1) = P(m_2) = 0.5$. The boundary of the
decision region is the perpendicular bisector of the line joining points $\mathbf{s}_1$ and $\mathbf{s}_2$. Note that any
point on the boundary is equidistant from $\mathbf{s}_1$ and $\mathbf{s}_2$. If $\mathbf{q}$ happens to fall on the boundary, we
just "flip a coin" and decide whether to select $m_1$ or $m_2$. Figure 10.20b shows the case of two
messages that are not equiprobable. To delineate the boundary of the decision regions, we use
Eq. (10.86). The decision is $m_1$ if

$$\|\mathbf{q} - \mathbf{s}_1\|^2 - \mathcal{N} \ln P(m_1) < \|\mathbf{q} - \mathbf{s}_2\|^2 - \mathcal{N} \ln P(m_2)$$

Otherwise, the decision is $m_2$.

Note that $\|\mathbf{q} - \mathbf{s}_1\|$ and $\|\mathbf{q} - \mathbf{s}_2\|$ represent distances $d_1$ and $d_2$, the distance of $\mathbf{q}$ from $\mathbf{s}_1$
and $\mathbf{s}_2$, respectively. Thus, the decision is $m_1$ if

$$d_1^2 - d_2^2 < \mathcal{N} \ln \frac{P(m_1)}{P(m_2)}$$

The right-hand side of this inequality is a constant $c$:

$$c = \mathcal{N} \ln \frac{P(m_1)}{P(m_2)}$$

Thus, the decision rule is

$$\text{Decision}(\mathbf{q}) = \begin{cases} m_1 & \text{if } d_1^2 - d_2^2 < c \\ m_2 & \text{if } d_1^2 - d_2^2 > c \\ \text{randomly } m_1 \text{ or } m_2 & \text{if } d_1^2 - d_2^2 = c \end{cases}$$

The boundary of the decision regions is given by $d_1^2 - d_2^2 = c$. We now show that such a
boundary is given by a straight line perpendicular to the line joining $\mathbf{s}_1$ and $\mathbf{s}_2$ and crossing that line
at a distance $\mu$ from $\mathbf{s}_1$, where

$$\mu = \frac{c + d^2}{2d} = \frac{\mathcal{N}}{2d} \ln \left[\frac{P(m_1)}{P(m_2)}\right] + \frac{d}{2} \tag{10.92}$$


where $d$ is the distance between $\mathbf{s}_1$ and $\mathbf{s}_2$. To prove this, we redraw the pertinent part of
Fig. 10.20b as Fig. 10.20c, from which it is evident that

$$d_1^2 = \alpha^2 + \mu^2 \qquad d_2^2 = \alpha^2 + (d - \mu)^2$$

Hence,

$$d_1^2 - d_2^2 = 2d\mu - d^2 = c$$

Therefore,

$$\mu = \frac{c + d^2}{2d}$$

This is the desired result. Thus, along the decision boundary $d_1^2 - d_2^2$ is constant and equal to $c$.
The boundaries of the decision regions for $M > 2$ may be determined via a similar argument.
The decision regions for the case of three equiprobable two-dimensional signals are shown
in Fig. 10.21. The boundaries of the decision regions are perpendicular bisectors of the lines
joining the original transmitted signals. If the signals are not equiprobable, then the boundaries
will be shifted away from the signals with larger probabilities of occurrence.
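As a quick numerical check of Eq. (10.92), the sketch below places $\mathbf{s}_1$ at the origin and $\mathbf{s}_2$ at distance $d$ along the horizontal axis, then confirms that every point on the vertical line at offset $\mu$ satisfies $d_1^2 - d_2^2 = c$. The function name `boundary_offset` and the numerical values are illustrative assumptions, not taken from the text.

```python
import math

def boundary_offset(d, noise_n, p1, p2):
    """Return (mu, c) for two signals a distance d apart, per Eq. (10.92)."""
    c = noise_n * math.log(p1 / p2)
    mu = (c + d * d) / (2.0 * d)
    return mu, c

# Assumed example values: noise_n stands for script-N of the text.
d, noise_n, p1, p2 = 2.0, 1.0, 0.7, 0.3
mu, c = boundary_offset(d, noise_n, p1, p2)

# Points (mu, alpha) on the boundary line, with s1 = (0, 0) and s2 = (d, 0).
residuals = []
for alpha in (-1.5, 0.0, 2.0):
    d1_sq = alpha ** 2 + mu ** 2
    d2_sq = alpha ** 2 + (d - mu) ** 2
    residuals.append((d1_sq - d2_sq) - c)

print(mu, residuals)
```

Because $P(m_1) > P(m_2)$ in this example, $\mu$ exceeds $d/2$: the boundary moves away from the more probable signal, as stated above.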

For signals in N-dimensional space, the decision regions will be N-dimensional hypercones.
If there are M messages $m_1, m_2, \ldots, m_M$ with decision regions $R_1, R_2, \ldots, R_M$,
respectively, then $P(C|m_i)$, the probability of a correct decision when $m_i$ is transmitted, is
given by

$$P(C|m_i) = P(\mathbf{q} \text{ lies in } R_i) \tag{10.93}$$

and $P(C)$, the probability of a correct decision, is given by

$$P(C) = \sum_{i=1}^{M} P(m_i) P(C|m_i) \tag{10.94}$$

and $P_{eM}$, the probability of error, is given by

$$P_{eM} = 1 - P(C) \tag{10.95}$$



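Example 10.2 Binary data is transmitted by using polar signaling over an AWGN channel with noise PSD
$\mathcal{N}/2$. The two signals used are

$$s_1(t) = -s_2(t) = \sqrt{E}\,\varphi(t) \tag{10.96}$$

The symbol probabilities $P(m_1)$ and $P(m_2)$ are unequal. Design the optimum receiver and
determine the corresponding error probability.

Figure 10.22 Decision regions for the binary case in this example.

The two signals are represented graphically in Fig. 10.22a. If the energy of each signal
is $E$, the distance of each signal from the origin is $\sqrt{E}$. The distance $d$ between the two
signals is

$$d = 2\sqrt{E}$$

The decision regions $R_1$ and $R_2$ are shown in Fig. 10.22a. The distance $\mu$ is given by
Eq. (10.92). Also, the conditional probability of correct decision is

$$P(C|m = m_1) = P(\text{noise vector originating at } \mathbf{s}_1 \text{ remains in } R_1) = P(n > -\mu) = 1 - Q\left(\frac{\mu}{\sqrt{\mathcal{N}/2}}\right)$$

Similarly,

$$P(C|m = m_2) = 1 - Q\left(\frac{d - \mu}{\sqrt{\mathcal{N}/2}}\right)$$

and the probability of correct decision is

$$P(C) = P(m_1)\left[1 - Q\left(\frac{\mu}{\sqrt{\mathcal{N}/2}}\right)\right] + P(m_2)\left[1 - Q\left(\frac{d - \mu}{\sqrt{\mathcal{N}/2}}\right)\right] = 1 - P(m_1)\,Q\left(\frac{\mu}{\sqrt{\mathcal{N}/2}}\right) - P(m_2)\,Q\left(\frac{d - \mu}{\sqrt{\mathcal{N}/2}}\right)$$

and

$$P_e = 1 - P(C) = P(m_1)\,Q\left(\frac{\mu}{\sqrt{\mathcal{N}/2}}\right) + P(m_2)\,Q\left(\frac{d - \mu}{\sqrt{\mathcal{N}/2}}\right) \tag{10.97a}$$

where

$$\mu = \frac{\mathcal{N}}{4\sqrt{E}} \ln \left[\frac{P(m_1)}{P(m_2)}\right] + \sqrt{E} \tag{10.97b}$$

and

$$d = 2\sqrt{E} \tag{10.97c}$$

When $P(m_1) = P(m_2) = 0.5$, $\mu = \sqrt{E} = d/2$, and Eq. (10.97a) reduces to

$$P_e = Q\left(\sqrt{\frac{2E}{\mathcal{N}}}\right) \tag{10.97d}$$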


In this problem, because $N = 1$ and $M = 2$, the receiver in Fig. 10.19 is preferable
to that in Fig. 10.18. For this case the receiver of the form in Fig. 10.19b reduces to that
shown in Fig. 10.22b. The decision threshold $d'$ as seen from Fig. 10.22a is

$$d' = \sqrt{E} - \mu = \frac{\mathcal{N}}{4\sqrt{E}} \ln \frac{P(m_2)}{P(m_1)}$$

Note that $d'$ is the decision threshold. Thus, in Fig. 10.22b, if the receiver output $r > d'$,
the decision is $m_1$. Otherwise the decision is $m_2$.

When $P(m_1) = P(m_2) = 0.5$, the decision threshold is zero. This is precisely the
result derived in Sec. 10.1 for polar signaling.
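The results of Example 10.2 can be evaluated numerically with a short Python sketch. The names `q_func` and `polar_unequal_priors`, the argument `noise_n` (standing for $\mathcal{N}$), and the sample values are assumptions made for illustration; the formulas are Eqs. (10.97a) to (10.97c) and the threshold $d' = \sqrt{E} - \mu$.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def polar_unequal_priors(energy, noise_n, p1):
    """Return (mu, threshold, Pe) for polar signaling with priors p1 and 1 - p1."""
    p2 = 1.0 - p1
    root_e = math.sqrt(energy)
    d = 2.0 * root_e                                               # Eq. (10.97c)
    mu = noise_n / (4.0 * root_e) * math.log(p1 / p2) + root_e     # Eq. (10.97b)
    sigma = math.sqrt(noise_n / 2.0)                               # noise std deviation
    pe = p1 * q_func(mu / sigma) + p2 * q_func((d - mu) / sigma)   # Eq. (10.97a)
    threshold = root_e - mu                                        # d' of the text
    return mu, threshold, pe

# Assumed example values: E = 1, script-N = 1.
mu_eq, thr_eq, pe_eq = polar_unequal_priors(1.0, 1.0, 0.5)
mu_sk, thr_sk, pe_sk = polar_unequal_priors(1.0, 1.0, 0.8)
print(thr_eq, pe_eq)
print(thr_sk, pe_sk)
```

With equal priors the threshold is zero and the error probability equals $Q(\sqrt{2E/\mathcal{N}})$. With $P(m_1) = 0.8$ the threshold becomes negative and the error probability drops below the equal-threshold value.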


10.6.5 Multiamplitude Signaling (PAM) 


We now consider the M-ary generalization of the binary polar signaling, often known as pulse
amplitude modulation (PAM). In the binary case, we transmit two symbols, consisting of the
pulses $p(t)$ and $-p(t)$, where $p(t)$ may be either a baseband pulse or a carrier modulated by a
baseband pulse. In the multiamplitude (PAM) case, the M symbols are transmitted by M pulses
$\pm p(t), \pm 3p(t), \pm 5p(t), \ldots, \pm (M - 1)p(t)$. Thus, to transmit $R_M$ M-ary digits per second, we
are required to transmit $R_M$ pulses per second of the form $kp(t)$. Pulses are transmitted every
$T_M$ seconds, so that $T_M = 1/R_M$. If $E_p$ is the energy of pulse $p(t)$, then assuming that pulses
$\pm p(t), \pm 3p(t), \pm 5p(t), \ldots, \pm (M - 1)p(t)$ are equally likely, the average pulse energy $E_{pM}$
is given by

$$E_{pM} = \frac{2}{M}\left[E_p + 9E_p + 25E_p + \cdots + (M - 1)^2 E_p\right] = \frac{2E_p}{M} \sum_{k=0}^{(M-2)/2} (2k + 1)^2 = \frac{M^2 - 1}{3} E_p \tag{10.98a}$$

$$\approx \frac{M^2}{3} E_p \qquad M \gg 1 \tag{10.98b}$$

Recall that an M-ary symbol carries an information of $\log_2 M$ bits. Hence, the bit energy $E_b$ is

$$E_b = \frac{E_{pM}}{\log_2 M} = \frac{M^2 - 1}{3 \log_2 M} E_p \tag{10.98c}$$

Because the transmission bandwidth is independent of the pulse amplitude, the M-ary bandwidth
is the same as in the binary case for the given rate of pulses, yet it carries more information.
This means that for a given information rate, the PAM bandwidth is less than that of the binary
case by a factor of $\log_2 M$.

To calculate the error probability, we observe that because we are dealing with the same
basic pulse $p(t)$, the optimum M-ary receiver is a filter matched to $p(t)$. When the input pulse
is $kp(t)$, the output at the sampling instant will be

$$r(T_M) = kA_p + n_o(T_M)$$

Note that $A_p = E_p$, the energy of $p(t)$, and that $\sigma_n^2$, the variance of $n_o(t)$, is $\mathcal{N}E_p/2$. Thus, the
optimum receiver for the multiamplitude M-ary signaling case is identical to that of the polar
binary case (see Fig. 10.3 or 10.6a). The sampler has M possible outputs

$$\pm kA_p + n_o(T_M) \qquad k = 1, 3, 5, \ldots, M - 1$$

that we wish to detect. The conditional PDFs $p(r|m_i)$ are Gaussian with mean $\pm kA_p$ and
variance $\sigma_n^2$, as shown in Fig. 10.23a. Let $P_{eM}$ be the error probability of detecting a symbol and
$P(\epsilon|m)$ be the error probability given that the symbol $m$ is transmitted.

Figure 10.23 (a) Conditional PDFs in PAM. (b) Error probability in PAM, plotted against $E_b/\mathcal{N}$ in dB.

To calculate $P_{eM}$, we observe that the case of the two extreme symbols [represented by
$\pm (M - 1)p(t)$] is similar to the binary case because they have to guard against only one
neighbor. As for the remaining symbols, they must guard against neighbors on both sides, and,
hence, $P(\epsilon|m)$ in this case is twice that of the extreme symbol. From Fig. 10.23a it is evident
that $P(\epsilon|m_i)$ is $Q(A_p/\sigma_n)$ for the two extreme signals and is $2Q(A_p/\sigma_n)$ for the remaining
$(M - 2)$ symbols. Hence,

$$P_{eM} = \sum_{i=1}^{M} P(m_i) P(\epsilon|m_i) = \frac{1}{M} \sum_{i=1}^{M} P(\epsilon|m_i) \tag{10.99a}$$

$$= \frac{1}{M}\left[2Q\left(\frac{A_p}{\sigma_n}\right) + (M - 2)\,2Q\left(\frac{A_p}{\sigma_n}\right)\right] = \frac{2(M - 1)}{M}\,Q\left(\frac{A_p}{\sigma_n}\right) \tag{10.99b}$$

For a matched filter receiver, $(A_p/\sigma_n)^2 = 2E_p/\mathcal{N}$, and

$$P_{eM} = 2\left(1 - \frac{1}{M}\right) Q\left(\sqrt{\frac{2E_p}{\mathcal{N}}}\right) \tag{10.99c}$$

$$= 2\left(\frac{M - 1}{M}\right) Q\left(\sqrt{\frac{6 \log_2 M}{M^2 - 1}\,\frac{E_b}{\mathcal{N}}}\right) \tag{10.99d}$$
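Equation (10.99d) is easy to evaluate numerically. The sketch below is an illustration only: the function names and the sample $E_b/\mathcal{N}$ value are assumptions, and `eb_over_n` is the linear (not dB) ratio $E_b/\mathcal{N}$.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam_symbol_error(m, eb_over_n):
    """Symbol error probability of M-ary PAM from Eq. (10.99d)."""
    arg = math.sqrt(6.0 * math.log2(m) / (m * m - 1) * eb_over_n)
    return 2.0 * (m - 1) / m * q_func(arg)

# Assumed operating point: Eb/N = 10 (that is, 10 dB).
for m in (2, 4, 8, 16):
    print(m, pam_symbol_error(m, 10.0))
```

For $M = 2$ the expression collapses to the polar binary result $Q(\sqrt{2E_b/\mathcal{N}})$, and at a fixed $E_b/\mathcal{N}$ the symbol error probability grows with $M$.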


Bit Error Rate (BER) 

It is somewhat unfair to compare M-ary signaling on the basis of $P_{eM}$, the error probability
of an M-ary symbol, which conveys the information of $k = \log_2 M$ bits. Because not all bits
are wrong when an M-ary symbol is wrong, this weighs unfairly against larger M. For a fair
comparison, we should compare various schemes in terms of their probability of bit error $P_b$,
rather than $P_{eM}$, the probability of symbol error (symbol error rate). We now show that for
multiamplitude signaling $P_b \approx P_{eM}/\log_2 M$.

Because the type of errors that predominate are those in which a symbol is mistaken for
its immediate neighbors (see Fig. 10.23a), it would be logical to assign neighboring M-ary
symbols binary code words that differ in the least possible digits. The Gray code* is suitable
for this purpose because adjacent binary combinations in this code differ only by one digit.
Hence, an error in one M-ary symbol detection most likely will cause only one error in a
group of $\log_2 M$ binary digits transmitted by the M-ary symbol. Hence, the bit error rate
$P_b \approx P_{eM}/\log_2 M$. Figure 10.23b shows $P_{eM}$ as a function of $E_b/\mathcal{N}$ for several values of
M. Note that the relationship $P_b \approx P_{eM}/\log_2 M$, valid for PAM, is not necessarily valid for
other schemes, owing to the specific code structure. One must recompute the relationship between $P_b$
and $P_{eM}$ for each specific scheme.


* Gray code can be constructed as follows. Construct an n-digit natural binary code (NBC) corresponding to $2^n$
decimal numbers. If $b_1 b_2 \cdots b_n$ is a code word in this code, then the corresponding Gray code word $g_1 g_2 \cdots g_n$ is
obtained by the rule

$$g_1 = b_1 \qquad g_k = b_k \oplus b_{k-1} \quad k \ge 2$$

Thus for n = 3, the binary code 000, 001, 010, 011, 100, 101, 110, 111 is transformed into the Gray code 000, 001,
011, 010, 110, 111, 101, 100.
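The Gray code rule in the footnote translates directly into code. The following Python sketch (function names are illustrative assumptions) applies $g_1 = b_1$, $g_k = b_k \oplus b_{k-1}$ to each n-digit natural binary word.

```python
def gray_from_binary(bits):
    """Convert one natural-binary code word (a string of 0s and 1s) to Gray code."""
    gray = [bits[0]]                      # g_1 = b_1
    for k in range(1, len(bits)):
        gray.append(str(int(bits[k]) ^ int(bits[k - 1])))  # g_k = b_k XOR b_(k-1)
    return "".join(gray)

def gray_table(n):
    """Gray code words for the 2^n natural binary words, in counting order."""
    return [gray_from_binary(format(i, f"0{n}b")) for i in range(2 ** n)]

table = gray_table(3)
print(table)
```

For n = 3 this reproduces the sequence listed in the footnote, and each word differs from its neighbor in exactly one digit.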
Trade-off between Power and Bandwidth

To maintain a given information rate, the pulse transmission rate in the M-ary case is reduced
by the factor $k = \log_2 M$. This means the bandwidth of the M-ary case is reduced by the
same factor $k = \log_2 M$. But to maintain the same $P_{eM}$, Eqs. (10.99) show that the power
transmitted per bit (which is proportional to $E_b$) increases roughly as

$$M^2/\log_2 M = 2^{2k}/k$$

On the other hand, if we maintain a given bandwidth, the information rate in the M-ary case is
increased by the factor $k = \log_2 M$. The transmitted power is equal to $E_b$ times the bit rate.
Hence, an increased data rate also necessitates increased power by the factor

$$(M^2/\log_2 M)(\log_2 M) = 2^{2k}$$

Thus, the power increases exponentially with the increase in information rate by a factor of $k$.
In high-powered radio systems, such a power increase may not be tolerable. Multiamplitude
systems are attractive when bandwidth is very costly. Thus we can see how to trade power
for bandwidth. Because the voice channels of a telephone network have a fixed bandwidth,
multiamplitude (or multiphase, or a combination of both) signaling is a more attractive method
of increasing the information rate. This is how voiceband computer modems achieve high data
rates.

All the results derived here apply to baseband as well as modulated digital systems with 
coherent detection. For noncoherent detection, similar relationships exist between the binary 
and M-ary systems.* 

10.6.6 M-ary QAM Analysis

In M-ary QAM, the transmitted signal is represented by

$$s_i(t) = a_i \sqrt{\frac{2}{T_M}} \cos \omega_c t + b_i \sqrt{\frac{2}{T_M}} \sin \omega_c t = a_i \varphi_1(t) + b_i \varphi_2(t) \qquad 0 \le t \le T_M$$

where

$$a_i = \pm\frac{d}{2},\ \pm\frac{3d}{2},\ \ldots,\ \pm\frac{(\sqrt{M} - 1)d}{2} \qquad b_i = \pm\frac{d}{2},\ \pm\frac{3d}{2},\ \ldots,\ \pm\frac{(\sqrt{M} - 1)d}{2} \tag{10.100}$$

It is easy to observe that the QAM signal space is two-dimensional with basis functions
$\varphi_1(t)$ and $\varphi_2(t)$. Instead of determining the optimum receiver and its error probability for an
arbitrary QAM constellation, we illustrate the basic approach by analyzing the 16-point QAM
configuration shown in Fig. 10.24a. We assume all signals to be equiprobable in an AWGN
channel.

* For the noncoherent case, the baseband pulses must be of the same polarity: for example,
$0, p(t), 2p(t), \ldots, (M - 1)p(t)$.

Figure 10.24 16-ary QAM.

Let us first calculate the error probability. The first quadrant of the signal space is reproduced
in Fig. 10.24b. Because all the signals are equiprobable, the decision region boundaries
will be perpendicular bisectors joining various signals, as shown in Fig. 10.24b.

From Fig. 10.24b it follows that

$$P(C|m_1) = P(\text{noise vector originating at } \mathbf{s}_1 \text{ lies within } R_1) = P\left(n_1 > -\frac{d}{2},\ n_2 > -\frac{d}{2}\right) = \left[1 - Q\left(\frac{d}{\sqrt{2\mathcal{N}}}\right)\right]^2$$

For convenience, let us define

$$p = 1 - Q\left(\frac{d}{\sqrt{2\mathcal{N}}}\right) \tag{10.101}$$

Hence,

$$P(C|m_1) = p^2$$

Using similar arguments, we have

$$P(C|m_2) = P(C|m_4) = \left[1 - Q\left(\frac{d}{\sqrt{2\mathcal{N}}}\right)\right]\left[1 - 2Q\left(\frac{d}{\sqrt{2\mathcal{N}}}\right)\right] = p(2p - 1)$$

and

$$P(C|m_3) = (2p - 1)^2$$

Because of the symmetry of the signals in all four quadrants, we get similar probabilities
for the four signals in each quadrant. Hence, the probability of correct decision is

$$P(C) = \sum_{i=1}^{16} P(C|m_i) P(m_i) = \frac{1}{16} \sum_{i=1}^{16} P(C|m_i)$$

$$= \frac{1}{16}\left[4p^2 + 4p(2p - 1) + 4p(2p - 1) + 4(2p - 1)^2\right] = \frac{1}{4}\left[9p^2 - 6p + 1\right] \tag{10.102}$$

and

$$P_{eM} = 1 - P(C) = \frac{9}{4}\left(p + \frac{1}{3}\right)(1 - p)$$

In practice, $P_{eM} \to 0$ if SNR is high and, hence, $P(C) \to 1$. This means $p \approx 1$ and $p + \frac{1}{3} \approx 1\frac{1}{3}$
[see Eq. (10.102)], and

$$P_{eM} \approx 3(1 - p) = 3Q\left(\frac{d}{\sqrt{2\mathcal{N}}}\right) \tag{10.103}$$

To express this in terms of the received power $S_i$, we determine $\bar{E}$, the average energy of the
signal set in Fig. 10.24. Because $E_k$, the energy of $\mathbf{s}_k$, is the square of the distance of $\mathbf{s}_k$ from
the origin,

$$E_1 = \left(\frac{3d}{2}\right)^2 + \left(\frac{3d}{2}\right)^2 = \frac{9}{2}d^2 \qquad E_2 = \left(\frac{3d}{2}\right)^2 + \left(\frac{d}{2}\right)^2 = \frac{5}{2}d^2$$

Similarly,

$$E_3 = \frac{d^2}{2} \quad \text{and} \quad E_4 = \frac{5}{2}d^2$$

Hence,

$$\bar{E} = \frac{1}{4}\left[\frac{9}{2}d^2 + \frac{5}{2}d^2 + \frac{1}{2}d^2 + \frac{5}{2}d^2\right] = \frac{5}{2}d^2$$

and $d^2 = 0.4\bar{E}$. Moreover, for $M = 16$, each symbol carries the information of $\log_2 16 = 4$
bits. Hence, the energy per bit $E_b$ is

$$E_b = \frac{\bar{E}}{4}$$

and

$$\frac{E_b}{\mathcal{N}} = \frac{\bar{E}}{4\mathcal{N}} = \frac{5d^2}{8\mathcal{N}}$$

Hence, for large $E_b/\mathcal{N}$,

$$P_{eM} = 3Q\left(\frac{d}{\sqrt{2\mathcal{N}}}\right) = 3Q\left(\sqrt{\frac{4E_b}{5\mathcal{N}}}\right) \tag{10.104}$$

A comparison of this with binary PSK [Eq. (10.33)] shows that 16-point QAM requires almost
2.5 times as much power as does binary PSK; but the rate of transmission is increased by a
factor of $\log_2 M = 4$. This comparison does not take into account the fact that $P_b$, the BER, is
somewhat smaller than $P_{eM}$.
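The exact 16-point QAM expression and the high-SNR approximation of Eq. (10.103) can be compared numerically. This Python sketch is illustrative only: the function names are assumptions, `eb_over_n` is the linear ratio $E_b/\mathcal{N}$, and the substitution $d^2/(2\mathcal{N}) = 4E_b/(5\mathcal{N})$ follows from $E_b/\mathcal{N} = 5d^2/(8\mathcal{N})$.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def qam16_symbol_error(eb_over_n):
    """Return (exact, approx) symbol error probabilities for 16-point QAM."""
    p = 1.0 - q_func(math.sqrt(0.8 * eb_over_n))   # Eq. (10.101) with d^2/(2N) = 4Eb/(5N)
    exact = 2.25 * (p + 1.0 / 3.0) * (1.0 - p)     # 1 - P(C), from Eq. (10.102)
    approx = 3.0 * (1.0 - p)                       # Eq. (10.103)
    return exact, approx

exact, approx = qam16_symbol_error(10.0)  # assumed operating point Eb/N = 10
print(exact, approx)
```

Since $p \le 1$, the exact value never exceeds the approximation, and the two agree closely once $E_b/\mathcal{N}$ is large.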

In terms of receiver implementation, because $N = 2$ and $M = 16$, the receiver in
Fig. 10.19 is preferable. Such a receiver is shown in Fig. 10.24c. Note that because all signals
are equiprobable,

$$a_i = -\frac{E_i}{2}$$

PSK is a special case of QAM with all signal points lying on a circle. Hence, the same analytical
approach applies. However, the analysis may be more convenient if polar coordinates
are selected. We use the following example to illustrate the two different approaches.


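Example 10.3 MPSK

Determine the error probability of the optimum receiver for equiprobable MPSK signals, each
with energy $E$.

Figure 10.25a shows the MPSK signal configuration for $M = 8$. Because all the signals are
equiprobable, the decision regions are conical, as shown. The message $m_1$ is transmitted
by a signal $s_1(t)$ represented by the vector $\mathbf{s}_1 = (s_1, 0)$. If the projection in the signal
space of the received signal $\mathbf{r}$ is $\mathbf{q} = (q_1, q_2)$, and the noise is $\mathbf{n} = (n_1, n_2)$, then

$$\mathbf{q} = (s_1 + n_1,\ n_2) = (\sqrt{E} + n_1,\ n_2)$$

Figure 10.25 MPSK signals.

The noise components $n_1$ and $n_2$ are independent Gaussian variables, each with zero mean and
variance $\mathcal{N}/2$, and the border of the conical region $R_1$ is $q_2 = \pm q_1 \tan(\pi/M)$. Hence,

$$P(C|m_1) = \frac{1}{\pi\mathcal{N}} \int_0^\infty \left[\int_{-q_1 \tan(\pi/M)}^{q_1 \tan(\pi/M)} e^{-q_2^2/\mathcal{N}}\,dq_2\right] e^{-(q_1 - \sqrt{E})^2/\mathcal{N}}\,dq_1$$

$$= \frac{1}{\sqrt{\pi\mathcal{N}}} \int_0^\infty \left[1 - 2Q\left(\frac{q_1 \tan(\pi/M)}{\sqrt{\mathcal{N}/2}}\right)\right] e^{-(q_1 - \sqrt{E})^2/\mathcal{N}}\,dq_1$$

Changing the variable to $x = \sqrt{2/\mathcal{N}}\,q_1$, we get

$$P(C|m_1) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \left[1 - 2Q\left(x \tan\frac{\pi}{M}\right)\right] e^{-\left(x - \sqrt{2E/\mathcal{N}}\right)^2/2}\,dx \tag{10.106a}$$

Using the fact that $E_b$, the energy per bit, is $E/\log_2 M$, we have

$$P(C|m_1) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \left[1 - 2Q\left(x \tan\frac{\pi}{M}\right)\right] e^{-\left[x - \sqrt{2E_b \log_2 M/\mathcal{N}}\right]^2/2}\,dx \tag{10.106b}$$

The integration can also be performed in cylindrical coordinates using the transformation
$q_1 = \rho\sqrt{\mathcal{N}/2} \cos\theta$ and $q_2 = \rho\sqrt{\mathcal{N}/2} \sin\theta$. The limits on $\rho$ are $(0, \infty)$ and those on $\theta$
are $-\pi/M$ to $\pi/M$. Hence,

$$P(C|m_1) = \frac{1}{2\pi} \int_{-\pi/M}^{\pi/M} d\theta \int_0^\infty \rho\,e^{-\left(\rho^2 - 2\rho\sqrt{2E/\mathcal{N}} \cos\theta + 2E/\mathcal{N}\right)/2}\,d\rho \tag{10.107a}$$

$$= \frac{1}{2\pi} \int_{-\pi/M}^{\pi/M} d\theta \int_0^\infty \rho\,e^{-\left[\rho^2 - 2\rho\sqrt{(2 \log_2 M)(E_b/\mathcal{N})} \cos\theta + (2 \log_2 M)(E_b/\mathcal{N})\right]/2}\,d\rho \tag{10.107b}$$

Because of the symmetry of the signal configuration, $P(C|m_i)$ is the same for all $i$. Hence,

$$P(C) = P(C|m_1)$$

and

$$P_{eM} = 1 - P(C|m_1)$$

On the other hand, because $s_i(t) = \sqrt{2E/T_M} \cos(\omega_0 t + \theta_i)$, where $\omega_0 = 2\pi/T_M$, $\theta_i = 2\pi i/M$,
the optimum receiver turns out to be just a phase detector similar to that shown
in Fig. 10.24 (Prob. 10.6-10). Based on this observation, an alternative expression of $P_{eM}$
can also be found. Since $p_\Theta(\theta)$ of the phase $\Theta$ of a sinusoid plus a bandpass Gaussian
noise is found in Eq. (9.86d),

$$P_{eM} = 1 - \int_{-\pi/M}^{\pi/M} p_\Theta(\theta)\,d\theta$$

The PDF $p_\Theta(\theta)$ in Eq. (9.86d) involves $A$ (the sinusoid amplitude) and $\sigma_n^2$ (the noise
variance). Assuming matched filtering and white noise [see Eq. (10.11a)],

$$\frac{A^2}{\sigma_n^2} = \frac{2E_p}{\mathcal{N}} = \frac{2E_b \log_2 M}{\mathcal{N}}$$

Figure 10.26 Error probability of MPSK.

Hence,

$$P_{eM} = 1 - \frac{1}{2\pi} \int_{-\pi/M}^{\pi/M} e^{-(E_b \log_2 M/\mathcal{N})} \left\{1 + \sqrt{\frac{4\pi E_b \log_2 M}{\mathcal{N}}} \cos\theta\; e^{(E_b \log_2 M/\mathcal{N}) \cos^2\theta} \left[1 - Q\left(\sqrt{\frac{2E_b \log_2 M}{\mathcal{N}}} \cos\theta\right)\right]\right\} d\theta \tag{10.108}$$

Figure 10.26 shows the plot of $P_{eM}$ as a function of $E_b/\mathcal{N}$. For $E_b/\mathcal{N} \gg 1$ (weak noise)
and $M \gg 2$, Eq. (10.108) can be approximated by⁷

$$P_{eM} \approx 2Q\left(\sqrt{\frac{2E_b \log_2 M}{\mathcal{N}}} \sin\frac{\pi}{M}\right) \tag{10.109a}$$

$$\approx 2Q\left(\sqrt{\frac{2\pi^2 E_b \log_2 M}{M^2 \mathcal{N}}}\right) \tag{10.109b}$$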
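The integral of Eq. (10.106b) has no closed form for general M, but it is straightforward to evaluate numerically and compare with the approximation of Eq. (10.109a). The Python sketch below uses a plain composite Simpson rule; the function names, the truncation of the upper limit at 12 standard deviations past the mean, and the operating points are assumptions chosen for illustration.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def mpsk_symbol_error(m, eb_over_n):
    """1 - P(C|m_1), with P(C|m_1) integrated numerically from Eq. (10.106b)."""
    mean = math.sqrt(2.0 * eb_over_n * math.log2(m))
    slope = math.tan(math.pi / m)
    def integrand(x):
        gauss = math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)
        return (1.0 - 2.0 * q_func(x * slope)) * gauss
    return 1.0 - simpson(integrand, 0.0, mean + 12.0)

def mpsk_approx(m, eb_over_n):
    """High-SNR approximation of Eq. (10.109a)."""
    return 2.0 * q_func(math.sqrt(2.0 * eb_over_n * math.log2(m)) * math.sin(math.pi / m))

pe_8 = mpsk_symbol_error(8, 10.0)
pe_8_approx = mpsk_approx(8, 10.0)
pe_4 = mpsk_symbol_error(4, 4.0)
print(pe_8, pe_8_approx, pe_4)
```

For $M = 4$ the signal set is a rotated 4-point QAM, so the numerical result can be checked against $2q - q^2$ with $q = Q(\sqrt{2E_b/\mathcal{N}})$.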
10.7 GENERAL EXPRESSION FOR ERROR
PROBABILITY OF OPTIMUM RECEIVERS

Thus far we have considered rather simple schemes in which the decision regions can be found
easily. The method of computing error probabilities from knowledge of decision regions has
also been discussed. When the number of signal space dimensions grows, it becomes harder
to visualize the decision regions graphically, and as a result the method loses its power. We
now develop an analytical expression for computing error probability for a general M-ary
scheme.

From the structure of the optimum receiver in Fig. 10.18, we observe that if $m_1$ is
transmitted, then the correct decision will be made only if

$$b_1 > b_2, b_3, \ldots, b_M$$

In other words,

$$P(C|m_1) = \text{probability}\,(b_1 > b_2, b_3, \ldots, b_M \mid m_1) \tag{10.110}$$

If $m_1$ is transmitted, then (Fig. 10.18)

$$b_k = \int_0^{T_M} [s_1(t) + n(t)]\,s_k(t)\,dt + a_k \tag{10.111}$$

Let

$$\rho_{ij} = \int_0^{T_M} s_i(t)\,s_j(t)\,dt \qquad i, j = 1, 2, \ldots, M \tag{10.112}$$

where the $\rho_{ij}$ are known as cross-correlations. Thus (if $m_1$ is transmitted),

$$b_k = \rho_{1k} + \int_0^{T_M} n(t)\,s_k(t)\,dt + a_k \tag{10.113a}$$

$$= \rho_{1k} + a_k + \sum_{j=1}^{N} s_{kj} n_j \tag{10.113b}$$

where $n_j$ is the component of $n(t)$ along $\varphi_j(t)$. Note that $\rho_{1k} + a_k$ is a constant, and variables
$n_j$ ($j = 1, 2, \ldots, N$) are independent jointly Gaussian variables, each with zero mean and a
variance of $\mathcal{N}/2$. Thus, variables $b_k$ are a linear combination of jointly Gaussian variables. It
follows that the variables $b_1, b_2, \ldots, b_M$ are also jointly Gaussian. The probability of making
a correct decision when $m_1$ is transmitted can be computed from Eq. (10.110). Note that $b_1$
can lie anywhere in the range $(-\infty, \infty)$. More precisely, if $p(b_1, b_2, \ldots, b_M|m_1)$ is the joint
conditional PDF of $b_1, b_2, \ldots, b_M$, then Eq. (10.110) can be expressed as

$$P(C|m_1) = \int_{-\infty}^{\infty} \int_{-\infty}^{b_1} \cdots \int_{-\infty}^{b_1} p(b_1, b_2, \ldots, b_M|m_1)\,db_1\,db_2 \cdots db_M \tag{10.114a}$$

where the limits of integration of $b_1$ are $(-\infty, \infty)$, and for the remaining variables the limits
are $(-\infty, b_1)$. Thus,

$$P(C|m_1) = \int_{-\infty}^{\infty} db_1 \int_{-\infty}^{b_1} db_2 \cdots \int_{-\infty}^{b_1} p(b_1, b_2, \ldots, b_M|m_1)\,db_M \tag{10.114b}$$

Similarly, $P(C|m_2), \ldots, P(C|m_M)$ can be computed, and

$$P(C) = \sum_{j=1}^{M} P(C|m_j) P(m_j)$$

and

$$P_{eM} = 1 - P(C)$$

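Example 10.4 Orthogonal Signal Set

In this set all M equal-energy signals $s_1(t), s_2(t), \ldots, s_M(t)$ are mutually orthogonal. As an
example, a signal set for $M = 3$ is shown in Fig. 10.27.

The orthogonal set $\{s_k(t)\}$ is characterized by

$$\langle \mathbf{s}_j, \mathbf{s}_k \rangle = \begin{cases} 0 & j \ne k \\ E & j = k \end{cases} \tag{10.115}$$

Hence,

$$\rho_{ij} = \langle \mathbf{s}_i, \mathbf{s}_j \rangle = \begin{cases} 0 & i \ne j \\ E & i = j \end{cases} \tag{10.116}$$

Further, we shall assume all signals to be equiprobable. This yields

$$a_k = \frac{1}{2}\left[\mathcal{N} \ln\left(\frac{1}{M}\right) - E_k\right] = -\frac{1}{2}(\mathcal{N} \ln M + E)$$

where $E_k = E$ is the energy of each signal. Note that $a_k$ is the same for every signal.
Because the constants $a_k$ enter the expression only for the sake of comparison
(Fig. 10.19b), when they are the same, they can be ignored (by setting $a_k = 0$). Also for an
orthogonal set,

$$s_k(t) = \sqrt{E}\,\varphi_k(t) \tag{10.117}$$

Therefore,

$$s_{kj} = \begin{cases} \sqrt{E} & k = j \\ 0 & k \ne j \end{cases} \tag{10.118}$$

Hence, from Eqs. (10.113b), (10.116), and (10.118), we have (when $m_1$ is transmitted)

$$b_k = \begin{cases} E + \sqrt{E}\,n_1 & k = 1 \\ \sqrt{E}\,n_k & k = 2, 3, \ldots, M \end{cases} \tag{10.119}$$

Note that $n_1, n_2, \ldots, n_M$ are independent Gaussian variables, each with zero mean and
variance $\mathcal{N}/2$. Variables $b_k$ that are of the form $(\alpha n_k + \beta)$ are also independent Gaussian
variables. Equation (10.119) shows that the variable $b_1$ has the mean $E$ and variance
$(\sqrt{E})^2(\mathcal{N}/2) = \mathcal{N}E/2$. Hence,

$$p_{b_1}(b_1) = \frac{1}{\sqrt{\pi\mathcal{N}E}}\,e^{-(b_1 - E)^2/\mathcal{N}E}$$

$$p_{b_k}(b_k) = \frac{1}{\sqrt{\pi\mathcal{N}E}}\,e^{-b_k^2/\mathcal{N}E} \qquad k = 2, 3, \ldots, M$$

Because $b_1, b_2, \ldots, b_M$ are independent, the joint probability density is the product of
the individual densities:

$$p(b_1, b_2, \ldots, b_M|m_1) = \frac{1}{\sqrt{\pi\mathcal{N}E}}\,e^{-(b_1 - E)^2/\mathcal{N}E} \prod_{k=2}^{M} \left(\frac{1}{\sqrt{\pi\mathcal{N}E}}\,e^{-b_k^2/\mathcal{N}E}\right)$$

and

$$P(C|m_1) = \frac{1}{\sqrt{\pi\mathcal{N}E}} \int_{-\infty}^{\infty} db_1 \left[e^{-(b_1 - E)^2/\mathcal{N}E}\right] \prod_{k=2}^{M} \left(\int_{-\infty}^{b_1} \frac{1}{\sqrt{\pi\mathcal{N}E}}\,e^{-b_k^2/\mathcal{N}E}\,db_k\right)$$

$$= \frac{1}{\sqrt{\pi\mathcal{N}E}} \int_{-\infty}^{\infty} db_1 \left[e^{-(b_1 - E)^2/\mathcal{N}E}\right] \left(\int_{-\infty}^{b_1} \frac{1}{\sqrt{\pi\mathcal{N}E}}\,e^{-x^2/\mathcal{N}E}\,dx\right)^{M-1}$$

$$= \frac{1}{\sqrt{\pi\mathcal{N}E}} \int_{-\infty}^{\infty} \left[1 - Q\left(\frac{b_1}{\sqrt{\mathcal{N}E/2}}\right)\right]^{M-1} e^{-(b_1 - E)^2/\mathcal{N}E}\,db_1 \tag{10.120a}$$

Changing the variable so that $b_1/\sqrt{\mathcal{N}E/2} = y$, and recognizing that $E/\mathcal{N} = (\log_2 M)E_b/\mathcal{N}$,
we obtain

$$P(C|m_1) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\left(y - \sqrt{2E/\mathcal{N}}\right)^2/2}\,[1 - Q(y)]^{M-1}\,dy \tag{10.120b}$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\left[y - \sqrt{2E_b \log_2 M/\mathcal{N}}\right]^2/2}\,[1 - Q(y)]^{M-1}\,dy \tag{10.120c}$$

Note that this signal set is geometrically symmetrical; that is, every signal has the same
relationship with other signals in the set. As a result,

$$P(C|m_1) = P(C|m_2) = \cdots = P(C|m_M)$$

Hence,

$$P(C) = P(C|m_1)$$

and

$$P_{eM} = 1 - P(C) = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\left[y - \sqrt{2E_b \log_2 M/\mathcal{N}}\right]^2/2}\,[1 - Q(y)]^{M-1}\,dy \tag{10.120d}$$

In Fig. 10.28 the result of $P_{eM}$ vs. $E_b/\mathcal{N}$ is computed and plotted. This plot shows an
interesting behavior for the case of $M = \infty$. As M increases, the performance improves
but at the expense of larger bandwidth. Hence, this is a typical case of trading bandwidth
for performance.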
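Equation (10.120d) can be evaluated by direct numerical integration. The Python sketch below is a minimal illustration under stated assumptions: the function names, the Simpson rule, and the truncation of the infinite limits to 12 standard deviations on either side of the mean are choices made here, not part of the text.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def orthogonal_symbol_error(m, eb_over_n):
    """Symbol error probability of M-ary orthogonal signaling, Eq. (10.120d)."""
    mean = math.sqrt(2.0 * eb_over_n * math.log2(m))
    def integrand(y):
        gauss = math.exp(-0.5 * (y - mean) ** 2) / math.sqrt(2.0 * math.pi)
        return gauss * (1.0 - q_func(y)) ** (m - 1)
    return 1.0 - simpson(integrand, mean - 12.0, mean + 12.0)

pe_2 = orthogonal_symbol_error(2, 8.0)
pe_2_low = orthogonal_symbol_error(2, 4.0)
pe_16_low = orthogonal_symbol_error(16, 4.0)
print(pe_2, pe_2_low, pe_16_low)
```

For $M = 2$ the result should agree with the known binary orthogonal value $Q(\sqrt{E_b/\mathcal{N}})$, and at $E_b/\mathcal{N} = 4$ the error probability for $M = 16$ comes out smaller than for $M = 2$, in line with the trend described for Fig. 10.28.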


Multitone Signaling (MFSK)

In the case of multitone signaling, M symbols are transmitted by M orthogonal pulses of
frequencies $\omega_1, \omega_2, \ldots, \omega_M$, each of duration $T_M$. Thus, the M transmitted pulses are of
the form

$$\sqrt{2}\,p'(t) \cos\omega_k t \qquad \omega_k = \frac{2\pi(N + k)}{T_M}$$

The receiver (Fig. 10.29) is a simple extension of the binary receiver. The incoming pulse is
multiplied by the corresponding references $\sqrt{2} \cos\omega_i t$ ($i = 1, 2, \ldots, M$). The filter $H(f)$ is
matched to the baseband pulse $p'(t)$ such that

$$h(t) = p'(T_M - t) \qquad 0 \le t \le T_M$$

The same result is obtained if in the ith bank, instead of using a multiplier and $H(f)$, we use
a filter matched to the RF pulse $p'(t) \cos\omega_i t$. The M bank outputs sampled at $t = T_M$ are
$b_1, b_2, \ldots, b_M$.

The integral appearing on the right-hand side of Eq. (10.121) is computed and plotted in
Fig. 10.28 ($P_{eM}$ vs. $E_b/\mathcal{N}$). This plot shows an interesting behavior for the case of $M = \infty$.
By properly taking the limit of $P_{eM}$ in Eq. (10.121) as $M \to \infty$, it can be shown that⁵

$$\lim_{M \to \infty} P_{eM} = \begin{cases} 1 & E_b/\mathcal{N} < \log_e 2 \\ 0 & E_b/\mathcal{N} > \log_e 2 \end{cases}$$

Because the signal power $S_i = E_b R_b$, where $R_b$ is the bit rate, it follows that for error-free
communication,

$$\frac{E_b}{\mathcal{N}} > \log_e 2 = \frac{1}{1.44} \qquad \text{or} \qquad \frac{S_i}{\mathcal{N} R_b} > \frac{1}{1.44}$$

Hence,

$$R_b < 1.44\,\frac{S_i}{\mathcal{N}} \ \text{bit/s} \tag{10.122}$$

This shows that M-ary orthogonal signaling can transmit error-free data at a rate of up to
$1.44\,S_i/\mathcal{N}$ bit/s as $M \to \infty$ (see Fig. 10.28).


Bit Error Rate (BER) of Orthogonal Signaling

For PAM and MPSK, we have shown that, by applying the Gray code, $P_b \approx P_{eM}/\log_2 M$.
This result is not valid for MFSK. The errors that predominate in PAM and MPSK are
those in which a symbol is mistaken for its immediate neighbor, so we can use the Gray code to
assign the adjacent symbols codes that differ in just one digit. In MFSK, on the other hand, a
symbol is equally likely to be mistaken for any of the remaining $M - 1$ symbols. Hence,
$P(\epsilon)$, the probability of mistaking one particular M-ary symbol for another particular symbol,
is the same for every such pair:

$$P(\epsilon) = \frac{P_{eM}}{M - 1} = \frac{P_{eM}}{2^k - 1}$$

If an M-ary symbol differs by 1 bit from $N_1$ number of symbols, and differs by 2 bits from $N_2$
number of symbols, and so on, then $\bar{N}_\epsilon$, the average number of bits in error in reception of an
M-ary symbol, is

$$\bar{N}_\epsilon = \sum_{n=1}^{k} n N_n P(\epsilon) = \sum_{n=1}^{k} n N_n \frac{P_{eM}}{2^k - 1} = \frac{P_{eM}}{2^k - 1} \sum_{n=1}^{k} n \binom{k}{n} = k\,2^{k-1}\,\frac{P_{eM}}{2^k - 1}$$

This is an average number of bits in error in a sequence of $k$ bits (one M-ary symbol).
Consequently, the BER, $P_b$, is this figure divided by $k$:

$$P_b = \frac{\bar{N}_\epsilon}{k} = \frac{2^{k-1}}{2^k - 1}\,P_{eM} \approx \frac{P_{eM}}{2} \qquad k \gg 1$$
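The counting argument above can be verified by brute force. In the Python sketch below (function names are illustrative assumptions), $N_n = \binom{k}{n}$ is the number of k-bit words that differ from a given word in exactly n bits, and the ratio $P_b/P_{eM}$ is computed both from the sum and from the closed form.

```python
from math import comb

def avg_bit_errors_per_symbol_error(k):
    """Average number of wrong bits given a symbol error, all M - 1 errors equally likely."""
    m = 2 ** k
    return sum(n * comb(k, n) for n in range(1, k + 1)) / (m - 1)

def ber_ratio(k):
    """Ratio Pb / PeM for M = 2^k orthogonal signals."""
    return avg_bit_errors_per_symbol_error(k) / k

for k in (1, 2, 4, 8):
    closed_form = 2 ** (k - 1) / (2 ** k - 1)
    print(k, ber_ratio(k), closed_form)
```

The ratio equals 1 for the binary case and approaches 1/2 as $k$ grows.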
From this discussion, one very interesting fact emerges: whenever the optimum receiver
is used, the error probability does not depend on specific signal waveforms; it depends only
on their geometrical configuration in the signal space.


Bandwidth and Power Trade-offs of M-ary Orthogonal Signals

As illustrated by Landau and Pollak,³ the dimensionality of a signal is $2BT_M + 1$, where $T_M$ is
the signal duration and $B$ is its essential bandwidth. It follows that for an N-dimensional signal
space ($N \le M$), the bandwidth is $B = (N - 1)/2T_M$. Thus, reducing the dimensionality N
reduces the bandwidth.

We can verify that N-dimensional signals can be transmitted over $(N - 1)/2T_M$ Hz by
constructing a specific signal set. Let us choose the following orthonormal signals:

$$\begin{aligned}
\varphi_0(t) &= \frac{1}{\sqrt{T_M}} \\
\varphi_1(t) &= \sqrt{\frac{2}{T_M}} \sin\omega_0 t \\
\varphi_2(t) &= \sqrt{\frac{2}{T_M}} \cos\omega_0 t \\
\varphi_3(t) &= \sqrt{\frac{2}{T_M}} \sin 2\omega_0 t \\
\varphi_4(t) &= \sqrt{\frac{2}{T_M}} \cos 2\omega_0 t \\
&\ \ \vdots \\
\varphi_{k-1}(t) &= \sqrt{\frac{2}{T_M}} \sin\left(\frac{k}{2}\,\omega_0 t\right) \\
\varphi_k(t) &= \sqrt{\frac{2}{T_M}} \cos\left(\frac{k}{2}\,\omega_0 t\right)
\end{aligned}
\qquad \omega_0 = \frac{2\pi}{T_M} \qquad 0 \le t \le T_M \tag{10.123}$$

These $k + 1$ orthogonal pulses have a total bandwidth of $(k/2)(\omega_0/2\pi) = k/2T_M$ Hz. Hence,
when $k + 1 = N$, the bandwidth* is $(N - 1)/2T_M$. Thus, $N = 2T_M B + 1$.

To attain a given error probability, there is a trade-off between the average energy of the
signal set and its bandwidth. If we reduce the signal space dimensionality, the transmission
bandwidth is reduced. But the distances among signals are now smaller, because of the reduced
dimensionality. This will increase $P_{eM}$. Hence, to maintain a given low $P_{eM}$, we must now
move the signals farther apart; that is, we must increase energy. Thus, the cost of reduced
bandwidth is paid in terms of increased energy. The trade-off between SNR and bandwidth
can also be described from the perspective of information theory (Sec. 13.6).

M-ary signaling provides us with additional means of exchanging, or trading, the transmission
rate, transmission bandwidth, and transmitted power. It provides us flexibility in designing
a proper communication system. Thus, for a given rate of transmission, we can trade the
transmission bandwidth for transmitted power. We can also increase the information rate by a
factor of $k$ ($k = \log_2 M$) by paying a suitable price in terms of the transmission bandwidth
or the transmitted power. Figure 10.28 showed that in multitone signaling the transmitted
power decreases with M. However, the transmission bandwidth increases linearly with M, or
exponentially with the rate increase factor $k$ ($M = 2^k$). Thus, multitone signaling is radically
different from multiamplitude or multiphase signaling. In the latter, the bandwidth is independent
of M, but the transmitted power increases as $M^2/\log_2 M = 2^{2k}/k$; that is, the power
increases exponentially with the information rate increase factor $k$. Thus, in the multitone case,
the bandwidth increases exponentially with $k$, and in the multiamplitude or multiphase case,
the power increases exponentially with $k$.

* Here we are ignoring the band spreading at the edge. This spread is about $1/T_M$ Hz. The actual bandwidth exceeds
$(N - 1)/2T_M$ by this amount.

The practical implication is that we should use multiamplitude or multiphase signaling if
the bandwidth is at a premium (as in telephone lines) and multitone signaling when power is
at a premium (as in space communication). A compromise exists between these two extremes.
Let us investigate the possibility of increasing the information rate by a factor $k$ simply by
increasing the number of binary pulses transmitted by a factor $k$. In this case, the transmitted
power increases linearly with $k$. Also because the bandwidth is proportional to the pulse rate,
the transmission bandwidth increases linearly with $k$. Thus, in this case, we can increase
the information rate by a factor of $k$ by increasing both the transmission bandwidth and the
transmitted power linearly with $k$, thus avoiding the phantom of the exponential increase that
was required in the M-ary system. But here we must increase both the bandwidth and the
power, whereas formerly the increase in information rate could be achieved by increasing either
the bandwidth or the power. We have thus a great flexibility in trading various parameters and
thus in our ability to match our resources to our requirements.


Example 10.5 Wearerequired to transmit 2.08 x 10 6 binary digits persecond with F b < 10 -6 . Three possible 
schemes are considered; 

(a) Binary 

(b) 16-ary ASK 

(c) 16-ary PSK 

The channel noise PSD is 5 n (w) = 10 -s > Determine the transmission bandwidth and the signal 
power required at the receiver input in each case. 

(a) Binary: We shall consider polar signaling (the most efficient scheme), 

/w. = Kr» = e(^ 

;; This yields E b /M — 11.35. The signal power Si = E b R b . Hence, 

■; Si = 11.35A r R b = 11.35(2 x 1(T*)(2.08 x 10 6 ) =0.47W 

Assuming raised-cosine baseband pulses of roll-off factor 1, the bandwidth Br is 

B r = R b = 2*08 MHz 

(b) 16-ary ASK: Because each 16-ary symbol carries the information equivalent
of $\log_2 16 = 4$ binary digits, we need transmit only $R_M = (2.08 \times 10^6)/4 = 0.52 \times 10^6$
16-ary pulses per second. This requires a bandwidth $B_T$ of 520 kHz for baseband pulses
and 1.04 MHz for modulated pulses (assuming raised-cosine pulses). Also,

$$P_b = 10^{-6} = \frac{P_{eM}}{\log_2 16}$$

Therefore,

$$P_{eM} = 4 \times 10^{-6} = 2\left(\frac{M-1}{M}\right) Q\left[\sqrt{\frac{6E_b \log_2 16}{\mathcal{N}(M^2 - 1)}}\right]$$

For $M = 16$, this yields $E_b = 0.449 \times 10^{-5}$. If the $M$-ary pulse rate is $R_M$, then

$$S_i = E_{pM} R_M = E_b \log_2 M \cdot R_M = 0.449 \times 10^{-5} \times 4 \times (0.52 \times 10^6) = 9.34\ \text{W}$$

(c) 16-ary PSK: We need transmit only $R_M = 0.52 \times 10^6$ pulses per second. For
baseband pulses, this will require a bandwidth of 520 kHz. But PSK is a modulated signal,
and the required bandwidth is $2(0.52 \times 10^6) = 1.04$ MHz. Also,

$$P_{eM} = 4P_b = 4 \times 10^{-6} \simeq 2Q\left[\sqrt{\frac{2\pi^2 E_b \log_2 16}{256\,\mathcal{N}}}\right]$$

This yields $E_b = 137.8 \times 10^{-8}$ and

$$S_i = E_b \log_2 16 \cdot R_M = (137.8 \times 10^{-8}) \times 4 \times (0.52 \times 10^6) = 2.86\ \text{W}$$
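The three power figures in Example 10.5 can be checked numerically. The following is a minimal sketch, assuming the error-probability expressions used in the example and inverting $Q(\cdot)$ by bisection; the function and variable names (`q_func`, `q_inv`, `required_powers`) are illustrative and not part of the text.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q_inv(p, lo=0.0, hi=40.0):
    """Invert Q by bisection (Q is strictly decreasing on [lo, hi])."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if q_func(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

NOISE = 2e-8      # N, chosen so that the noise PSD N/2 equals 1e-8
RB = 2.08e6       # bit rate in bits per second
PB = 1e-6         # target bit error probability
M = 16
K = math.log2(M)  # bits per 16-ary symbol

def required_powers():
    """Return the received power S_i = E_b * R_b (watts) for each scheme."""
    # (a) polar binary: Pb = Q(sqrt(2 Eb / N))
    eb_bin = q_inv(PB) ** 2 / 2 * NOISE
    # (b) M-ary ASK: K*Pb = 2 (M-1)/M * Q(sqrt(6 K Eb / (N (M^2 - 1))))
    x = q_inv(K * PB * M / (2 * (M - 1)))
    eb_ask = x ** 2 * NOISE * (M ** 2 - 1) / (6 * K)
    # (c) M-ary PSK: K*Pb ~ 2 Q(sqrt(2 pi^2 K Eb / (M^2 N)))
    x = q_inv(K * PB / 2)
    eb_psk = x ** 2 * NOISE * M ** 2 / (2 * math.pi ** 2 * K)
    return {"binary": eb_bin * RB, "ask16": eb_ask * RB, "psk16": eb_psk * RB}

if __name__ == "__main__":
    for name, watts in required_powers().items():
        print(f"{name}: {watts:.2f} W")
```

The results should agree with the worked values to within the rounding used when reading $Q(\cdot)$ from tables.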


10.8 EQUIVALENT SIGNAL SETS 


The computation of error probabilities is greatly facilitated by the translation and rotation of 
coordinate axes. We now show that such operations are permissible. 

Consider a signal set with its corresponding decision regions, as shown in Fig. 10.30a.
The conditional probability $P(C|m_1)$ is the probability that the noise vector drawn from $\mathbf{s}_1$ lies
within $R_1$. Note that this probability does not depend on the origin of the coordinate system.
We may translate the coordinate system any way we wish. This is equivalent to translating the
signal set and the corresponding decision regions. Thus, the $P(C|m_1)$ for the translated system
shown in Fig. 10.30b is identical to that of the system in Fig. 10.30a.

In the case of Gaussian noise, we make another important observation. The rotation of the 
coordinate system does not affect the error probability because the noise-vector probability 
density has spherical symmetry. To show this, we shall consider Fig. 10.30c, which shows the 
signal set in Fig. 10.30a translated and rotated. Note that a rotation of the coordinate system 
is equivalent to a rotation of the signal set in the opposite sense. Here for convenience we 
rotate the signal set instead of the coordinate system. It can be seen that the probability that the 
noise vector $\mathbf{n}$ drawn from $\mathbf{s}_1$ lies in $R_1$ is the same in Fig. 10.30a and c, since this probability
is given by the integral of the noise probability density $p_{\mathbf{n}}(\mathbf{n})$ over the region $R_1$. Because
$p_{\mathbf{n}}(\mathbf{n})$ has a spherical symmetry for Gaussian noise, the probability will remain unaffected by a
rotation of the region $R_1$. Clearly, for additive Gaussian channel noise, translation and rotation
of the coordinate system (or translation and rotation of the signal set) do not affect the error
probability. Note that when we rotate or translate a set of signals, the resulting set represents
an entirely different set of signals. Yet the error probabilities of the two sets are identical. Such
sets are called equivalent sets.

Figure 10.30 Translation and rotation of coordinate axes.

The following example demonstrates the utility of translation and rotation of a signal set 
in the computation of error probability. 


Example 10.6 A quaternary PSK (QPSK) signal set is shown in Fig. 10.31a:

$$\mathbf{s}_1 = -\mathbf{s}_2 = \sqrt{E}\,\boldsymbol{\varphi}_1 \qquad \mathbf{s}_3 = -\mathbf{s}_4 = \sqrt{E}\,\boldsymbol{\varphi}_2$$

Assuming all symbols to be equiprobable, determine $P_{eM}$ for an AWGN channel with noise
PSD $\mathcal{N}/2$.

Figure 10.31 Analysis of QPSK.

This problem has already been solved in Example 10.4 for a general value of $M$. Here we
shall solve it for $M = 4$ to demonstrate the power of the rotation of axes.

Because all the symbols are equiprobable, the decision region boundaries will be 
perpendicular bisectors of lines joining various signal points (Fig. 10.31a). Now 

$$P(C|m_1) = P(\text{noise vector originating at } \mathbf{s}_1 \text{ remains in } R_1) \tag{10.124}$$

This can be found by integrating the joint PDF of components $n_1$ and $n_2$ (originating at
$\mathbf{s}_1$) over the region $R_1$. This double integral can be found by using suitable limits, as in
Eq. (10.106). The problem is greatly simplified, however, if we rotate the signal set by
45°, as shown in Fig. 10.31b. The decision regions are rectangular, and if $n_1$ and $n_2$ are
noise components along $\boldsymbol{\varphi}_1$ and $\boldsymbol{\varphi}_2$, then Eq. (10.124) can be expressed as

$$P(C|m_1) = P\left(n_1 > -\sqrt{\tfrac{E}{2}},\ n_2 > -\sqrt{\tfrac{E}{2}}\right) = P\left(n_1 > -\sqrt{\tfrac{E}{2}}\right) P\left(n_2 > -\sqrt{\tfrac{E}{2}}\right)$$

$$= \left[1 - Q\left(\sqrt{\frac{E}{\mathcal{N}}}\right)\right]^2 \tag{10.125a}$$

$$= \left[1 - Q\left(\sqrt{\frac{2E_b}{\mathcal{N}}}\right)\right]^2 \tag{10.125b}$$
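The claim that rotation leaves the error probability unchanged can be checked by simulation. The sketch below is an illustration only: it assumes minimum-distance decisions, uses hypothetical values $E = 4$ and $\mathcal{N} = 1$, and estimates $P(C|m_1)$ for the QPSK set before and after a 45° rotation, comparing both with Eq. (10.125a).

```python
import math
import random

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def qpsk_points(energy, angle_deg):
    """Four QPSK signal points of energy `energy`, rotated by angle_deg."""
    r = math.sqrt(energy)
    return [(r * math.cos(math.radians(angle_deg + 90 * i)),
             r * math.sin(math.radians(angle_deg + 90 * i))) for i in range(4)]

def simulate_pc(points, noise_level, trials, seed):
    """Estimate P(C|m1) when points[0] is sent and the nearest point is chosen."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_level / 2.0)  # each noise component has variance N/2
    sx, sy = points[0]
    correct = 0
    for _ in range(trials):
        qx, qy = sx + rng.gauss(0, sigma), sy + rng.gauss(0, sigma)
        d2 = [(qx - px) ** 2 + (qy - py) ** 2 for px, py in points]
        correct += d2.index(min(d2)) == 0
    return correct / trials

E, N = 4.0, 1.0  # hypothetical symbol energy and noise level
pc_formula = (1 - q_func(math.sqrt(E / N))) ** 2          # Eq. (10.125a)
pc_fig_a = simulate_pc(qpsk_points(E, 0), N, 50000, seed=1)   # unrotated set
pc_fig_b = simulate_pc(qpsk_points(E, 45), N, 50000, seed=2)  # rotated by 45 degrees

if __name__ == "__main__":
    print(pc_formula, pc_fig_a, pc_fig_b)
```

Both estimates should fall within the Monte Carlo scatter of the closed-form value, which is what the equivalence argument predicts.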


10.8.1 Minimum Energy Signal Set 

As noted earlier, an infinite number of possible equivalent signal sets exist. Because signal
energy depends on its distance from the origin, however, equivalent sets do not necessarily
have the same average energy. Thus, among the infinite possible equivalent signal sets, the
one in which the signals are closest to the origin has the minimum average signal energy (or
transmitted power).

Let $m_1, m_2, \ldots, m_M$ be $M$ messages with waveforms $s_1(t), s_2(t), \ldots, s_M(t)$, represented,
respectively, by points $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_M$ in the signal space. The mean energy of these
signals is $\overline{E}$, given by

$$\overline{E} = \sum_{i=1}^{M} P(m_i)\,\|\mathbf{s}_i\|^2$$

Translation of this signal set is equivalent to subtracting some vector $\mathbf{a}$ from each signal. We
now use this simple operation to yield a minimum mean energy set. We basically wish to find
the vector $\mathbf{a}$ such that the new mean energy

$$\overline{E}' = \sum_{i=1}^{M} P(m_i)\,\|\mathbf{s}_i - \mathbf{a}\|^2 \tag{10.126}$$

is minimum. We can show that $\mathbf{a}$ must be the center of gravity of $M$ points located at
$\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_M$ with masses $P(m_1), P(m_2), \ldots, P(m_M)$, respectively,

$$\mathbf{a} = \sum_{i=1}^{M} P(m_i)\,\mathbf{s}_i = \overline{\mathbf{s}_i} \tag{10.127}$$

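To prove this, suppose the mean energy is minimum for some translation $\mathbf{b}$. Then

$$\overline{E}' = \sum_{i=1}^{M} P(m_i)\,\|\mathbf{s}_i - \mathbf{b}\|^2 = \sum_{i=1}^{M} P(m_i)\,\|(\mathbf{s}_i - \mathbf{a}) + (\mathbf{a} - \mathbf{b})\|^2$$

$$= \sum_{i=1}^{M} P(m_i)\,\|\mathbf{s}_i - \mathbf{a}\|^2 + 2\left\langle (\mathbf{a} - \mathbf{b}),\ \sum_{i=1}^{M} P(m_i)(\mathbf{s}_i - \mathbf{a}) \right\rangle + \sum_{i=1}^{M} P(m_i)\,\|\mathbf{a} - \mathbf{b}\|^2$$

Observe that the second term in the foregoing expression vanishes according to Eq. (10.127)
because

$$\sum_{i=1}^{M} P(m_i)(\mathbf{s}_i - \mathbf{a}) = \sum_{i=1}^{M} P(m_i)\,\mathbf{s}_i - \mathbf{a}\sum_{i=1}^{M} P(m_i) = \mathbf{a} - \mathbf{a} \cdot 1 = 0$$

Hence,

$$\overline{E}' = \sum_{i=1}^{M} P(m_i)\,\|\mathbf{s}_i - \mathbf{a}\|^2 + \sum_{i=1}^{M} P(m_i)\,\|\mathbf{a} - \mathbf{b}\|^2$$

This is minimum when $\mathbf{b} = \mathbf{a}$. Note that the rotation of the coordinates does not change the
energy, and, hence, there is no need to rotate the signal set to minimize the energy after the
translation.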
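The center-of-gravity result can be illustrated with a short numerical sketch. The signal points, the separation `d = 2`, and the probabilities below are hypothetical values chosen for illustration; the helper names are not from the text.

```python
def mean_energy(points, probs):
    """Mean energy: sum of P(m_i) * ||s_i||^2."""
    return sum(p * sum(c * c for c in s) for s, p in zip(points, probs))

def center_of_gravity(points, probs):
    """Vector a = sum of P(m_i) * s_i, as in Eq. (10.127)."""
    dim = len(points[0])
    return tuple(sum(p * s[k] for s, p in zip(points, probs)) for k in range(dim))

def translate(points, a):
    """Subtract the vector a from every signal point."""
    return [tuple(c - ak for c, ak in zip(s, a)) for s in points]

d = 2.0                                              # hypothetical signal separation
points = [(d / 2 ** 0.5, 0.0), (0.0, d / 2 ** 0.5)]  # binary orthogonal set
probs = [0.25, 0.75]                                 # hypothetical a priori probabilities

a = center_of_gravity(points, probs)
e_before = mean_energy(points, probs)
e_after = mean_energy(translate(points, a), probs)
e_other = mean_energy(translate(points, (0.1, 0.1)), probs)  # some other translation

if __name__ == "__main__":
    print(e_before, e_after, e_other)
```

Here `e_after` equals $P(m_1)P(m_2)d^2$, and any other translation, such as the arbitrary `(0.1, 0.1)` used above, gives a larger mean energy.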


Example 10.7 For the binary orthogonal signal set of Fig. 10.32a, determine the minimum energy equivalent
signal set.

Figure 10.32 Equivalent signal sets.

The minimum energy set for this case is shown in Fig. 10.32b. The origin lies at the center
of gravity of the signals. We have also rotated the signals for convenience. The distances
$k_1$ and $k_2$ must be such that

$$k_1 + k_2 = d$$

and

$$k_1 P(m_1) = k_2 P(m_2)$$

Solution of these two equations yields

$$k_1 = P(m_2)\,d$$

and

$$k_2 = P(m_1)\,d$$

Both signal sets (Fig. 10.32a and b) have the same error probability, but the latter has
a smaller mean energy. If $\overline{E}$ and $\overline{E}'$ are the respective mean energies of the two sets, then

$$\overline{E} = P(m_1)\frac{d^2}{2} + P(m_2)\frac{d^2}{2} = \frac{d^2}{2}$$

and

$$\overline{E}' = P(m_1)k_1^2 + P(m_2)k_2^2 = P(m_1)P^2(m_2)d^2 + P(m_2)P^2(m_1)d^2 = P(m_1)P(m_2)d^2$$

Note that for $P(m_1) + P(m_2) = 1$, the product $P(m_1)P(m_2)$ is maximum when $P(m_1) =
P(m_2) = 1/2$, in which case

$$P(m_1)P(m_2) = \frac{1}{4}$$

and consequently

$$\overline{E}' \le \frac{d^2}{4}$$

Therefore,

$$\overline{E}' \le \frac{\overline{E}}{2}$$

and for the case of equiprobable signals,

$$\overline{E}' = \frac{\overline{E}}{2}$$

In this case,

$$k_1 = k_2 = \frac{d}{2} \qquad \overline{E} = \frac{d^2}{2} \quad \text{and} \quad \overline{E}' = \frac{d^2}{4}$$

The signals in Fig. 10.32b are called antipodal signals when $k_1 = k_2$. The error probability
of the signal set in Fig. 10.32a (and 10.32b) is equal to that in Fig. 10.22a and can be found
from Eq. (10.97a).

As a concrete example, let us choose the basis signals as sinusoids of frequency
$\omega_0 = 2\pi/T_M$:

$$\varphi_1(t) = \sqrt{\frac{2}{T_M}}\sin\omega_0 t \qquad \varphi_2(t) = \sqrt{\frac{2}{T_M}}\sin 2\omega_0 t \qquad 0 < t < T_M$$

Hence,

$$s_1(t) = \frac{d}{\sqrt{2}}\,\varphi_1(t) = \frac{d}{\sqrt{T_M}}\sin\omega_0 t \qquad s_2(t) = \frac{d}{\sqrt{2}}\,\varphi_2(t) = \frac{d}{\sqrt{T_M}}\sin 2\omega_0 t \qquad 0 < t < T_M$$

The signals $s_1(t)$ and $s_2(t)$ are shown in Fig. 10.32c, and the geometrical representation
is shown in Fig. 10.32a. Both signals are located at a distance $d/\sqrt{2}$ from the origin, and
the distance between the signals is $d$.

The minimum energy signals $s_1'(t)$ and $s_2'(t)$ for this set are given by

$$s_1'(t) = \sqrt{\frac{2}{T_M}}\,P(m_2)\,d\,\sin\omega_0 t \qquad s_2'(t) = -\sqrt{\frac{2}{T_M}}\,P(m_1)\,d\,\sin\omega_0 t \qquad 0 < t < T_M$$

These signals are sketched in Fig. 10.32d.


10.8.2 Simplex Signal Set 

A minimum energy equivalent set of an equiprobable orthogonal set is called a simplex,
or transorthogonal, signal set. A simplex set can be derived as an equivalent set from the
orthogonal set in Eq. (10.115).

To obtain the minimum energy set, the origin should be shifted to the center of gravity of the
signal set. For the two-dimensional case (Fig. 10.33a), the simplex set is shown in Fig. 10.33c,
and for the three-dimensional case (Fig. 10.33b), the simplex set is shown in Fig. 10.33d. Note
that the dimensionality of the simplex signal set is less than that of the orthogonal set by 1.
This is true in general for any value of $M$. It can be shown that the simplex signal set is the
optimum (minimum error probability) for the case of equiprobable signals embedded in white
Gaussian noise when energy is constrained.$^{4,8}$

Figure 10.33 Simplex signals.
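We can calculate the mean energy of the simplex set by noting that it is obtained by
translating the orthogonal set by a vector $\mathbf{a}$, given in Eq. (10.127),

$$\mathbf{a} = \frac{1}{M}\sum_{i=1}^{M} \mathbf{s}_i$$

For orthogonal signals,

$$\mathbf{s}_i = \sqrt{E}\,\boldsymbol{\varphi}_i$$

Therefore,

$$\mathbf{a} = \frac{\sqrt{E}}{M}\sum_{i=1}^{M} \boldsymbol{\varphi}_i$$

where $E$ is the energy of each signal in the orthogonal set and $\boldsymbol{\varphi}_i$ is the unit vector along the
$i$th coordinate axis. The signals in the simplex set are given by

$$\mathbf{s}_k' = \mathbf{s}_k - \mathbf{a} = \sqrt{E}\,\boldsymbol{\varphi}_k - \frac{\sqrt{E}}{M}\sum_{i=1}^{M} \boldsymbol{\varphi}_i \tag{10.128}$$

The energy $E'$ of signal $\mathbf{s}_k'$ is given by $\|\mathbf{s}_k'\|^2$,

$$E' = \langle \mathbf{s}_k', \mathbf{s}_k' \rangle \tag{10.129}$$

Substituting Eq. (10.128) into Eq. (10.129) and observing that the set $\boldsymbol{\varphi}_i$ is orthonormal, we
have

$$E' = E - \frac{E}{M} = E\left(1 - \frac{1}{M}\right) \tag{10.130}$$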

Hence, for the same performance (error probability), the mean energy of the simplex signal set
is $1 - 1/M$ times that of the orthogonal signal set. For $M \gg 1$, the difference is not significant.

For this reason and because of the simplicity in generating them, orthogonal signals, rather than
simplex signals, are used in practice whenever $M$ exceeds 4 or 5.

In Sec. 13.6, we shall show that in the limit as $M \to \infty$, the orthogonal (as well as the
simplex) signals attain the upper bound of performance predicted by Shannon's theorem.
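The construction of a simplex set from an orthogonal set can be sketched in a few lines. This is an illustration under the assumptions of equiprobable signals and the hypothetical values $M = 4$, $E = 1$; the function names are not from the text.

```python
def simplex_from_orthogonal(m, energy):
    """Shift an M-point orthogonal set (each of energy `energy`) to its centroid."""
    root = energy ** 0.5
    ortho = [[root if i == k else 0.0 for i in range(m)] for k in range(m)]
    # Equiprobable signals: the center of gravity is the plain average.
    a = [sum(s[i] for s in ortho) / m for i in range(m)]
    return [[s[i] - a[i] for i in range(m)] for s in ortho]

def inner(u, v):
    """Inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))

M, E = 4, 1.0  # hypothetical values
simplex = simplex_from_orthogonal(M, E)
energies = [inner(s, s) for s in simplex]                       # each E(1 - 1/M)
vector_sum = [sum(s[i] for s in simplex) for i in range(M)]     # zero vector
dist_sq = inner([x - y for x, y in zip(simplex[0], simplex[1])],
                [x - y for x, y in zip(simplex[0], simplex[1])])  # still 2E

if __name__ == "__main__":
    print(energies, vector_sum, dist_sq)
```

Because the simplex vectors sum to zero, they are linearly dependent and span only $M - 1$ dimensions, while the distance between any two signals is the same as in the orthogonal set.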


10.9 NONWHITE (COLORED) CHANNEL NOISE 

Thus far we have restricted our analysis exclusively to white Gaussian channel noise. Our
analysis can be extended to nonwhite, or colored, Gaussian channel noise. To proceed,
the Karhunen-Loeve expansion of Eq. (10.37) must be solved for the colored noise with
autocorrelation function $R_x(t, t_1)$. This general solution, however, can be quite complex to
implement.$^{4}$

Fortunately, for a large class of colored Gaussian noises, the power spectral density $S_n(f)$ is
nonzero within the message signal bandwidth $B$. This property provides an effective alternative.
We use a noise-whitening filter $H(f)$ at the input of the receiver, where

$$H(f) = \frac{1}{\sqrt{S_n(f)}}\,e^{-j2\pi f t_d}$$

The delay $t_d$ is introduced to ensure that the whitening filter is causal (realizable).

Consider a signal set $\{s_i(t)\}$ and a channel noise $n(t)$ that is not white [$S_n(f)$ is not constant].
At the input of the receiver, we use a noise-whitening filter $H(f)$ that transforms the colored
noise into white noise (Fig. 10.34). But it also alters the signal set $\{s_i(t)\}$ to $\{s_i'(t)\}$, where

$$s_i'(t) = s_i(t) * h(t)$$

We now have a new signal set $\{s_i'(t)\}$ mixed with white Gaussian noise, for which the optimum
receiver and the corresponding error probability can be determined by the method discussed
earlier.

Figure 10.34 Optimum M-ary receiver for nonwhite channel noise.


10.10 OTHER USEFUL PERFORMANCE CRITERIA

The optimum receiver uses the decision strategy that makes the best possible use of the observed 
data and any a priori information available. The strategy will also depend on the weights 
assigned to various types of error. In this chapter we have thus far assumed that all errors have 
equal weight (or equal cost). This assumption is not justified in all cases, and we may therefore 
have to alter the decision rule. 

Generalized Bayes Receiver 

If we are given a priori probabilities and the cost functions of errors of various types, the
receiver that minimizes the average cost of decision is called the Bayes receiver, and the
decision rule is Bayes' decision rule. Note that the receiver that has been discussed so far
is the Bayes receiver under the condition that all errors have equal cost (equal weight). To
generalize this rule, let

$$C_{kj} = \text{cost of deciding that } \hat{m} = m_k \text{ when } m_j \text{ was transmitted} \tag{10.131}$$

and, as usual,

$$P(m_j|\mathbf{q}) = \text{conditional probability that } m_j \text{ was transmitted when } \mathbf{q} \text{ is received}$$

If $\mathbf{q}$ is received, then the probability that $m_j$ was transmitted is $P(m_j|\mathbf{q})$ for all $j = 1, 2, \ldots, M$.
Hence, the average cost of the decision $\hat{m} = m_k$ is $\beta_k$, given by

$$\beta_k = C_{k1}P(m_1|\mathbf{q}) + C_{k2}P(m_2|\mathbf{q}) + \cdots + C_{kM}P(m_M|\mathbf{q}) = \sum_{j=1}^{M} C_{kj}P(m_j|\mathbf{q}) \tag{10.132}$$

Thus, if $\mathbf{q}$ is received, the optimum receiver decides that $\hat{m} = m_k$ if

$$\beta_k < \beta_i \quad \text{for all } i \ne k$$

or

$$\sum_{j=1}^{M} C_{kj}P(m_j|\mathbf{q}) < \sum_{j=1}^{M} C_{ij}P(m_j|\mathbf{q}) \quad \text{for all } i \ne k \tag{10.133}$$

Use of Bayes' mixed rule in Eq. (10.133) yields

$$\sum_{j=1}^{M} C_{kj}P(m_j)\,p_{\mathbf{q}}(\mathbf{q}|m_j) < \sum_{j=1}^{M} C_{ij}P(m_j)\,p_{\mathbf{q}}(\mathbf{q}|m_j) \quad \text{for all } i \ne k \tag{10.134}$$

Note that $C_{kk}$ is the cost of setting $\hat{m} = m_k$ when $m_k$ is transmitted. This cost is generally zero.
If we assign equal weight to all other errors, then

$$C_{kj} = \begin{cases} 0 & k = j \\ 1 & k \ne j \end{cases} \tag{10.135}$$

and the decision rule in Eq. (10.134) reduces to the rule in Eq. (10.83), as expected. The
generalized Bayes receiver for $M = 2$, assuming $C_{11} = C_{22} = 0$, sets $\hat{m} = m_1$ if

$$C_{12}P(m_2)\,p_{\mathbf{q}}(\mathbf{q}|m_2) < C_{21}P(m_1)\,p_{\mathbf{q}}(\mathbf{q}|m_1)$$

Otherwise, the receiver decides that $\hat{m} = m_2$.
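The rule of Eq. (10.134) amounts to picking the row of the cost matrix with the smallest weighted sum. A minimal sketch follows; the priors, likelihood values, and cost matrices are hypothetical numbers chosen only to show how unequal costs can change the decision.

```python
def bayes_decide(costs, priors, likelihoods):
    """Return the index k minimizing sum_j C[k][j] * P(m_j) * p(q|m_j)."""
    risks = [sum(c_kj * p * l for c_kj, p, l in zip(row, priors, likelihoods))
             for row in costs]
    return risks.index(min(risks))

priors = [0.5, 0.5]          # hypothetical P(m1), P(m2)
likelihoods = [0.30, 0.20]   # hypothetical p(q|m1), p(q|m2) at the observed q
equal_cost = [[0, 1], [1, 0]]   # Eq. (10.135): all errors weighted equally
skewed_cost = [[0, 5], [1, 0]]  # C12 = 5: deciding m1 when m2 was sent is costly

choice_equal = bayes_decide(equal_cost, priors, likelihoods)    # index 0 means m1
choice_skewed = bayes_decide(skewed_cost, priors, likelihoods)  # index 1 means m2

if __name__ == "__main__":
    print(choice_equal, choice_skewed)
```

With equal costs the receiver follows the larger a posteriori probability; raising $C_{12}$ pushes the decision toward $m_2$ even though $m_1$ is the more likely message for this observation.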


Maximum Likelihood Receiver 

The strategy used in the Bayes receiver discussed in the preceding subsection is general, except
that it can be implemented only when the a priori probabilities $P(m_1), P(m_2), \ldots, P(m_M)$ are
known. Frequently this information is not available. Under these conditions various possibilities
exist, depending on the assumptions made. When, for example, there is no reason to expect
any one signal to be more likely than any other, we may assign equal probabilities to all the
messages:

$$P(m_1) = P(m_2) = \cdots = P(m_M) = \frac{1}{M}$$

Bayes' rule [Eq. (10.83)] in this case becomes: set $\hat{m} = m_k$ if

$$p_{\mathbf{q}}(\mathbf{q}|m_k) > p_{\mathbf{q}}(\mathbf{q}|m_i) \quad \text{for all } i \ne k \tag{10.136}$$

Observe that $p_{\mathbf{q}}(\mathbf{q}|m_k)$ represents the probability of observing $\mathbf{q}$ when $m_k$ is transmitted. Thus,
the receiver chooses that signal which, when transmitted, will maximize the likelihood (probability)
of observing the received $\mathbf{q}$. Hence, this receiver is called the maximum likelihood
receiver. Note that the maximum likelihood receiver is a Bayes receiver for the cost of
Eq. (10.135) under the condition that the a priori message probabilities are equal. In terms
of geometrical concepts, the maximum likelihood receiver decides in favor of that signal
which is closest to the received data $\mathbf{q}$. The practical implementation of the maximum likelihood
receiver is the same as that of the Bayes receiver (Figs. 10.18 and 10.19) under the
condition that all a priori probabilities are equal to $1/M$.
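The geometrical form of the maximum likelihood rule (choose the signal closest to the received vector) can be sketched as follows. The four-point signal set and the received vectors are hypothetical, and the closest-point rule is taken here for white Gaussian noise, as in the preceding discussion.

```python
def ml_decide(q, signals):
    """Return the index of the signal point closest to the received vector q."""
    d2 = [sum((qi - si) ** 2 for qi, si in zip(q, s)) for s in signals]
    return d2.index(min(d2))

# Hypothetical four-point signal set in a two-dimensional signal space.
signals = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]

first = ml_decide((0.2, 0.9), signals)     # closest to (0, 1), index 1
second = ml_decide((-0.7, -0.1), signals)  # closest to (-1, 0), index 2

if __name__ == "__main__":
    print(first, second)
```

No a priori probabilities appear anywhere in the rule, which is the practical appeal of this receiver when the source statistics are unknown.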

If the signal set is geometrically symmetrical, and if all a priori probabilities are equal
(maximum likelihood receiver), then the decision regions for various signals are congruent.
In this case, because of symmetry, the conditional probability of a correct decision is the same
no matter which signal is transmitted, that is,

$$P(C|m_i) = \text{constant} \quad \text{for all } i$$

Because

$$P(C) = \sum_{i=1}^{M} P(m_i)P(C|m_i)$$

in this case

$$P(C) = P(C|m_i) \tag{10.137}$$

Thus, the error probability of the maximum likelihood receiver is independent of the actual
source statistics $P(m_i)$ for the case of symmetrical signal sets. It should, however, be realized
that if the actual source statistics were known beforehand, one could use Bayes' decision rule
to design a better receiver.

It is apparent that if the source statistics are not known, the maximum likelihood receiver
proves very attractive for a symmetrical signal set. In such a receiver one can specify the error
probability independently of the actual source statistics.

Minimax Receiver 

Designing a receiver with a certain decision rule completely specifies the conditional
probabilities $P(C|m_i)$. The probability of error is given by

$$P_{eM} = 1 - P(C) = 1 - \sum_{i=1}^{M} P(m_i)P(C|m_i)$$

Thus, in general, for a given receiver (with some specified decision rule) the error probability
depends on the source statistics $P(m_i)$. The error probability is the largest for some source
statistics. The error probability in the worst possible case is $[P_{eM}]_{\max}$ and represents the upper
bound on the error probability of the given receiver. This upper bound $[P_{eM}]_{\max}$ serves as an
indication of the quality of the receiver. Each receiver (with a certain decision rule) will have a
certain $[P_{eM}]_{\max}$. The receiver that has the smallest upper bound on the error probability, that
is, the minimum $[P_{eM}]_{\max}$, is called the minimax receiver.

We shall illustrate the minimax concept for a binary receiver with on-off signaling. The
conditional PDFs of the receiving-filter output sample $r$ at $t = T_b$ are $p_r(r|1)$ and $p_r(r|0)$. These
are the PDFs of $r$ for the "on" and the "off" pulse (i.e., no pulse), respectively. Figure 10.35a
shows these PDFs with a certain threshold $a$. If we receive $r \ge a$, we choose the hypothesis
"signal present" (1), and the shaded area to the right of $a$ is the probability of false alarm
(deciding "signal present" when in fact the signal is not present). If $r < a$, we choose the
hypothesis "signal absent" (0), and the shaded area to the left of $a$ is the probability of false
dismissal (deciding "signal absent" when in fact the signal is present). It is obvious that the
larger the threshold $a$, the larger the false dismissal error and the smaller the false alarm error
(Fig. 10.35b).

We shall now find the minimax condition for this receiver. For the minimax receiver, we
consider all possible receivers (all possible values of $a$ in this case) and find the maximum
error probability (or cost) that occurs under the worst possible a priori probability distribution.
Let us choose $a = a_1$, as shown in Fig. 10.35b. In this case the worst possible case occurs
when $P(0) = 1$ and $P(1) = 0$, that is, when the signal $s(t)$ is always absent. The type of
error in this case is false alarm. These errors have a cost $C_1$. On the other hand, if we choose
$a = a_2$, the worst possible case occurs when $P(0) = 0$ and $P(1) = 1$, that is, when the signal
is always present, causing only the false-dismissal type of errors. These errors have a cost $C_2$.
It is evident that for the setting $a = \alpha$, the costs of false alarm and false dismissal are equal,
namely, $C_\alpha$. Hence, for all possible source statistics the cost is $C_\alpha$. Because $C_\alpha < C_1$ and $C_2$,
this cost is the minimum of the maximum possible cost (because the worst cases are considered)
that accrues for all values of $a$. Hence, $a = \alpha$ represents the minimax setting.

Figure 10.35 Explanation of minimax concept.
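The minimax setting can be located numerically as the threshold at which the false-alarm and false-dismissal probabilities are equal. The sketch below is an illustration under an added assumption that is not stated in the text: the two conditional PDFs are taken to be Gaussian with equal variance, with means 0 ("off") and `amp` ("on"). All names and numbers are hypothetical.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def false_alarm(a, sigma):
    """P(r >= a | signal absent), assuming r ~ N(0, sigma^2)."""
    return q_func(a / sigma)

def false_dismissal(a, amp, sigma):
    """P(r < a | signal present), assuming r ~ N(amp, sigma^2)."""
    return q_func((amp - a) / sigma)

def worst_case_error(a, amp, sigma):
    # Pe = P(0) Pfa + P(1) Pfd is linear in P(0), so its maximum over all
    # source statistics is reached at P(0) = 1 or P(0) = 0.
    return max(false_alarm(a, sigma), false_dismissal(a, amp, sigma))

def minimax_threshold(amp, sigma, iters=100):
    """Bisect for the threshold where false alarm equals false dismissal."""
    lo, hi = 0.0, amp
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if false_alarm(mid, sigma) > false_dismissal(mid, amp, sigma):
            lo = mid   # threshold too low: false alarms dominate
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = minimax_threshold(amp=2.0, sigma=0.5)

if __name__ == "__main__":
    print(alpha, worst_case_error(alpha, 2.0, 0.5))
```

Under this symmetric Gaussian assumption the minimax threshold falls midway between the two means; moving the threshold in either direction raises the worst-case error.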

It follows from this discussion that the minimax receiver is rather conservative. It is
designed under the pessimistic assumption that the worst possible source statistics exist. The
maximum likelihood receiver, on the other hand, is designed on the assumption that all messages
are equally likely. It can, however, be shown that for a symmetrical signal set, the
maximum likelihood receiver is in fact the minimax receiver. This can be proved by observing
that for a symmetrical set, the probability of error of a maximum likelihood receiver (equal a
priori probabilities) is independent of the source statistics [Eq. (10.137)]. Hence, for a symmetrical
set, the error probability $P_{eM} = \alpha$ of a maximum likelihood receiver is also equal
to its $[P_{eM}]_{\max}$. We now show that no other receiver exists whose $[P_{eM}]_{\max}$ is less than the
$\alpha$ of a maximum likelihood receiver for a symmetrical signal set. This is seen from the fact
that for equiprobable messages, the maximum likelihood receiver is optimum by definition.
All other receivers must have $P_{eM} \ge \alpha$ for equiprobable messages. Hence, $[P_{eM}]_{\max}$ for these
receivers can never be less than $\alpha$. This proves that the maximum likelihood receiver is indeed
the minimax receiver for a symmetrical signal set.


10.11 NONCOHERENT DETECTION 

If the phase $\theta$ in the received RF pulse $\sqrt{2}\,p'(t)\cos(\omega_c t + \theta)$ is unknown, we can no longer
use coherent detection techniques. Instead, we must rely on noncoherent techniques, such as
envelope detection. It can be shown$^{9,10}$ that when the phase $\theta$ of the received pulse is random
and uniformly distributed over $(0, 2\pi)$, the optimum detector is a filter matched to the RF
pulse $\sqrt{2}\,p'(t)\cos\omega_c t$ followed by an envelope detector, a sampler (to sample at $t = T_b$), and
a comparator to make the decision (Fig. 10.36).

Amplitude Shift Keying 

The noncoherent detector for ASK is shown in Fig. 10.36. The filter $H(f)$ is a filter matched to
the RF pulse, ignoring the phase. This means the filter output amplitude $A_p$ will not necessarily
be maximum at the sampling instant. But the envelope will be close to maximum at the sampling
instant (Fig. 10.36). The matched filter output is now detected by an envelope detector. The
envelope is sampled at $t = T_b$ for making the decision.

Figure 10.36 Noncoherent detection of digital modulated signals for ASK.

Figure 10.37 Conditional PDFs in the noncoherent detection of ASK signals.

When a 1 is transmitted, the output of the envelope detector at $t = T_b$ is an envelope of a
sine wave of amplitude $A_p$ in a Gaussian noise of variance $\sigma_n^2$. In this case, the envelope $r$ has
a Ricean density, given by [Eq. (9.86a)]

$$p_r(r|m=1) = \frac{r}{\sigma_n^2}\,e^{-(r^2 + A_p^2)/2\sigma_n^2}\,I_0\left(\frac{rA_p}{\sigma_n^2}\right) \tag{10.138a}$$

Also, when $A_p \gg \sigma_n$ (small-noise case) from Eq. (9.86c), we have

$$p_r(r|m=1) \simeq \sqrt{\frac{r}{2\pi A_p \sigma_n^2}}\,e^{-(r - A_p)^2/2\sigma_n^2} \tag{10.138b}$$

$$\simeq \frac{1}{\sigma_n\sqrt{2\pi}}\,e^{-(r - A_p)^2/2\sigma_n^2} \tag{10.138c}$$

Observe that for small noise, the PDF of $r$ is practically Gaussian, with mean $A_p$ and variance
$\sigma_n^2$. When 0 is transmitted, the output of the envelope detector is an envelope of a Gaussian
noise of variance $\sigma_n^2$. The envelope in this case has a Rayleigh density, given by [Eq. (9.81)]

$$p_r(r|m=0) = \frac{r}{\sigma_n^2}\,e^{-r^2/2\sigma_n^2}$$

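Both $p_r(r|m=1)$ and $p_r(r|m=0)$ are shown in Fig. 10.37. Using the argument used earlier
(see Fig. 10.4), the optimum threshold is found to be the point where the two densities intersect.
Hence, the optimum threshold $a_0$ satisfies

$$\frac{a_0}{\sigma_n^2}\,e^{-(a_0^2 + A_p^2)/2\sigma_n^2}\,I_0\left(\frac{A_p a_0}{\sigma_n^2}\right) = \frac{a_0}{\sigma_n^2}\,e^{-a_0^2/2\sigma_n^2}$$

This equation is satisfied to a close approximation for

$$a_0 = \frac{A_p}{2}\sqrt{1 + \frac{8\sigma_n^2}{A_p^2}}$$

Because the matched filter is used, $A_p = E_p$ and $\sigma_n^2 = \mathcal{N}E_p/2$. Moreover, for ASK there are,
on the average, only $R_b/2$ nonzero pulses per second. Thus, $E_b = E_p/2$. Hence,

$$\left(\frac{A_p}{\sigma_n}\right)^2 = \frac{2E_p}{\mathcal{N}} = \frac{4E_b}{\mathcal{N}}$$

and

$$a_0 = E_b\sqrt{1 + \frac{2}{E_b/\mathcal{N}}} \tag{10.139a}$$

Observe that the optimum threshold is not constant but depends on $E_b/\mathcal{N}$. This is a serious
drawback in a fading channel. For a strong signal, $E_b/\mathcal{N} \gg 1$,

$$a_0 \simeq E_b = \frac{A_p}{2} \tag{10.139b}$$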


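and

$$P(\epsilon|m=0) = \int_{A_p/2}^{\infty} p_r(r|m=0)\,dr = \int_{A_p/2}^{\infty} \frac{r}{\sigma_n^2}\,e^{-r^2/2\sigma_n^2}\,dr = e^{-A_p^2/8\sigma_n^2} = e^{-E_b/2\mathcal{N}} \tag{10.140}$$

Also,

$$P(\epsilon|m=1) = \int_{0}^{A_p/2} p_r(r|m=1)\,dr$$

Evaluation of this integral is somewhat cumbersome.$^{4}$ For a strong signal (that is, for $E_b/\mathcal{N} \gg 1$),
the Ricean PDF can be approximated by the Gaussian PDF [Eq. (9.86c)], and

$$P(\epsilon|m=1) \simeq \frac{1}{\sigma_n\sqrt{2\pi}}\int_{-\infty}^{A_p/2} e^{-(r - A_p)^2/2\sigma_n^2}\,dr = Q\left(\frac{A_p}{2\sigma_n}\right) = Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right) \tag{10.141}$$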


As a result,

$$P_b = P_m(0)\,P(\epsilon|m=0) + P_m(1)\,P(\epsilon|m=1)$$

Assuming $P_m(1) = P_m(0) = 0.5$,

$$P_b = \frac{1}{2}\left[e^{-E_b/2\mathcal{N}} + Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right)\right] \tag{10.142a}$$

Using the $Q(\cdot)$ approximation in Eq. (8.38a),

$$P_b \simeq \frac{1}{2}\left[1 + \frac{1}{\sqrt{2\pi E_b/\mathcal{N}}}\right]e^{-E_b/2\mathcal{N}} \qquad E_b/\mathcal{N} \gg 1 \tag{10.142b}$$

$$\simeq \frac{1}{2}\,e^{-E_b/2\mathcal{N}} \tag{10.142c}$$

Note that in an optimum receiver, for $E_b/\mathcal{N} \gg 1$, $P(\epsilon|m=1)$ is much smaller than
$P(\epsilon|m=0)$. For example, at $E_b/\mathcal{N} = 10$, $P(\epsilon|m=0) \simeq 8.7\,P(\epsilon|m=1)$. Hence, mistaking
0 for 1 is the type of error that predominates. The timing information in noncoherent detection
is extracted from the envelope of the received signal by methods discussed in Sec. 7.5.2.

For a coherent detector,

$$P_b = Q\left(\sqrt{\frac{E_b}{\mathcal{N}}}\right) \simeq \frac{1}{\sqrt{2\pi E_b/\mathcal{N}}}\,e^{-E_b/2\mathcal{N}} \qquad E_b/\mathcal{N} \gg 1 \tag{10.143}$$

This appears similar to Eq. (10.142c) (the noncoherent case). Thus for a large $E_b/\mathcal{N}$, the
performances of the coherent detector and the envelope detector are similar (Fig. 10.38).
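The ASK expressions above are easy to evaluate numerically. The following sketch, with illustrative function names, computes the strong-signal noncoherent result of Eq. (10.142a), the coherent result of Eq. (10.143), the threshold of Eq. (10.139a), and the ratio of the two conditional error probabilities at $E_b/\mathcal{N} = 10$.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pb_noncoherent_ask(ebn):
    """Eq. (10.142a): strong-signal approximation, equiprobable bits."""
    return 0.5 * (math.exp(-ebn / 2) + q_func(math.sqrt(ebn)))

def pb_coherent_ask(ebn):
    """Eq. (10.143): coherent ASK."""
    return q_func(math.sqrt(ebn))

def ask_threshold(eb, ebn):
    """Eq. (10.139a): approximate optimum threshold for energy per bit eb."""
    return eb * math.sqrt(1 + 2 / ebn)

# Ratio P(e|m=0) / P(e|m=1) at Eb/N = 10, from Eqs. (10.140) and (10.141).
ratio_at_10 = math.exp(-10 / 2) / q_func(math.sqrt(10))

if __name__ == "__main__":
    for ebn_db in (8, 10, 12, 14):
        ebn = 10 ** (ebn_db / 10)
        print(ebn_db, pb_noncoherent_ask(ebn), pb_coherent_ask(ebn))
    print(ratio_at_10)
```

The ratio comes out between 8 and 9, in line with the figure quoted in the text, and the threshold approaches $E_b$ as $E_b/\mathcal{N}$ grows.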

Frequency Shift Keying 

A noncoherent receiver for FSK is shown in Fig. 10.39. The filters $H_0(f)$ and $H_1(f)$ are
matched to the two RF pulses corresponding to 0 and 1, respectively. The outputs of the
envelope detectors at $t = T_b$ are $r_0$ and $r_1$, respectively. The noise components of outputs of
filters $H_0(f)$ and $H_1(f)$ are the Gaussian RVs $n_0$ and $n_1$, respectively, with $\sigma_{n_0} = \sigma_{n_1} = \sigma_n$.

Figure 10.39 Noncoherent detection of binary FSK.

If 1 is transmitted ($m = 1$), then at the sampling instant, the envelope $r_1$ has the
Ricean PDF*

$$p_{r_1}(r_1) = \frac{r_1}{\sigma_n^2}\,e^{-(r_1^2 + A_p^2)/2\sigma_n^2}\,I_0\left(\frac{r_1 A_p}{\sigma_n^2}\right)$$

and $r_0$ is the noise envelope with Rayleigh density

$$p_{r_0}(r_0) = \frac{r_0}{\sigma_n^2}\,e^{-r_0^2/2\sigma_n^2}$$

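The decision is $\hat{m} = 1$ if $r_1 > r_0$ and $\hat{m} = 0$ if $r_1 < r_0$. Hence, when binary 1 is transmitted,
an error is made if $r_0 > r_1$.

$$P(\epsilon|m=1) = P(r_0 > r_1)$$

The event $r_0 > r_1$ is the same as the joint event "$r_1$ has any positive value† and $r_0$ has a value
greater than $r_1$." This is simply the joint event $(0 < r_1 < \infty,\ r_0 > r_1)$. Hence,

$$P(\epsilon|m=1) = P(0 < r_1 < \infty,\ r_0 > r_1) = \int_0^{\infty}\int_{r_1}^{\infty} p_{r_1 r_0}(r_1, r_0)\,dr_0\,dr_1$$

Because $r_1$ and $r_0$ are independent, $p_{r_1 r_0} = p_{r_1}p_{r_0}$. Hence,

$$P(\epsilon|m=1) = \int_0^{\infty} \frac{r_1}{\sigma_n^2}\,e^{-(r_1^2 + A_p^2)/2\sigma_n^2}\,I_0\left(\frac{r_1 A_p}{\sigma_n^2}\right)\left[\int_{r_1}^{\infty} \frac{r_0}{\sigma_n^2}\,e^{-r_0^2/2\sigma_n^2}\,dr_0\right]dr_1$$

$$= \int_0^{\infty} \frac{r_1}{\sigma_n^2}\,e^{-(2r_1^2 + A_p^2)/2\sigma_n^2}\,I_0\left(\frac{r_1 A_p}{\sigma_n^2}\right)dr_1$$

Letting $x = \sqrt{2}\,r_1$ and $\alpha = A_p/\sqrt{2}$, we have

$$P(\epsilon|m=1) = \frac{1}{2}\,e^{-A_p^2/4\sigma_n^2}\int_0^{\infty} \frac{x}{\sigma_n^2}\,e^{-(x^2 + \alpha^2)/2\sigma_n^2}\,I_0\left(\frac{x\alpha}{\sigma_n^2}\right)dx$$

* An orthogonal FSK is assumed. This ensures that $r_0$ and $r_1$ have Rayleigh and Rice densities, respectively, when 1
is transmitted.

† $r_1$ is the envelope detector output and can take on only positive values.

Observe that the integrand is a Ricean density, and, hence, its integral is unity. Therefore,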

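$$P(\epsilon|m=1) = \frac{1}{2}\,e^{-A_p^2/4\sigma_n^2} \tag{10.144a}$$

Note that for a matched filter,

$$\rho_{\max}^2 = \frac{A_p^2}{\sigma_n^2} = \frac{2E_p}{\mathcal{N}}$$

For FSK, $E_b = E_p$, and Eq. (10.144a) becomes

$$P(\epsilon|m=1) = \frac{1}{2}\,e^{-E_b/2\mathcal{N}} \tag{10.144b}$$

Similarly,

$$P(\epsilon|m=0) = \frac{1}{2}\,e^{-E_b/2\mathcal{N}} \tag{10.144c}$$

and

$$P_b = \frac{1}{2}\,e^{-E_b/2\mathcal{N}} \tag{10.145}$$

This behavior is similar to that of noncoherent ASK [Eq. (10.142c)]. Again we observe that
for $E_b/\mathcal{N} \gg 1$, the performance of coherent and noncoherent FSK are essentially similar.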

From the practical point of view, FSK is to be preferred over ASK because FSK has a
fixed optimum threshold, whereas the optimum threshold of ASK depends on $E_b/\mathcal{N}$ (the signal
level). Hence, ASK is particularly susceptible to signal fading. Because the decision of FSK
involves a comparison between $r_0$ and $r_1$, both variables will be affected equally by signal
fading. Hence, channel fading does not degrade the noncoherent FSK performance as it does
the noncoherent ASK. This is the outstanding advantage of noncoherent FSK over noncoherent
ASK. In addition, unlike noncoherent ASK, probabilities $P(\epsilon|m=1)$ and $P(\epsilon|m=0)$ are
equal in noncoherent FSK. The price paid by FSK for such an advantage is its larger bandwidth
requirement.


Noncoherent MFSK 

From the practical point of view, phase coherence of $M$ frequencies is difficult to maintain.
Hence in practice, coherent MFSK is rarely used. Noncoherent MFSK is much more common.
The receiver for noncoherent MFSK is similar to that for binary noncoherent FSK (Fig. 10.39),
but with $M$ banks corresponding to $M$ frequencies, in which filter $H_i(f)$ is matched to the RF
pulse $p(t)\cos\omega_i t$. The analysis is straightforward. If $m = 1$ is transmitted, then $r_1$ is the envelope
of a sinusoid of amplitude $A_p$ plus bandpass Gaussian noise, and $r_j$ ($j = 2, 3, \ldots, M$) is
the envelope of the bandpass Gaussian noise. Hence, $r_1$ has Ricean density, and $r_2, r_3, \ldots, r_M$
have Rayleigh density. From the same arguments used in the coherent case, we have

$$P_{CM} = P(C|m=1) = P(0 < r_1 < \infty,\ r_2 < r_1,\ r_3 < r_1,\ \ldots,\ r_M < r_1)$$

$$= \int_0^{\infty} \frac{r_1}{\sigma_n^2}\,I_0\left(\frac{r_1 A_p}{\sigma_n^2}\right)e^{-(r_1^2 + A_p^2)/2\sigma_n^2}\left[\int_0^{r_1} \frac{x}{\sigma_n^2}\,e^{-x^2/2\sigma_n^2}\,dx\right]^{M-1}dr_1$$

$$= \int_0^{\infty} \frac{r_1}{\sigma_n^2}\,I_0\left(\frac{r_1 A_p}{\sigma_n^2}\right)e^{-(r_1^2 + A_p^2)/2\sigma_n^2}\left(1 - e^{-r_1^2/2\sigma_n^2}\right)^{M-1}dr_1$$

Substituting $r_1^2/2\sigma_n^2 = x$ and $(A_p/\sigma_n)^2 = 2E_p/\mathcal{N} = 2E_b\log_2 M/\mathcal{N}$, we obtain
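$$P_{CM} = e^{-E_b\log_2 M/\mathcal{N}}\int_0^{\infty}\left(1 - e^{-x}\right)^{M-1}e^{-x}\,I_0\left(2\sqrt{\frac{xE_b\log_2 M}{\mathcal{N}}}\right)dx \tag{10.146a}$$

Using the binomial theorem to expand $(1 - e^{-x})^{M-1}$, we obtain

$$\left(1 - e^{-x}\right)^{M-1} = \sum_{m=0}^{M-1}\binom{M-1}{m}(-1)^m e^{-mx}$$

Substitution of this equality into Eq. (10.146a) and recognizing that

$$\int_0^{\infty} y\,e^{-ay^2}\,I_0(by)\,dy = \frac{1}{2a}\,e^{b^2/4a}$$

we obtain (after interchanging the order of summation and integration)

$$P_{CM} = \sum_{m=0}^{M-1}\binom{M-1}{m}\frac{(-1)^m}{m+1}\,e^{-mE_b\log_2 M/\mathcal{N}(m+1)} \tag{10.146b}$$

and

$$P_{eM} = 1 - P_{CM} = \sum_{m=1}^{M-1}\binom{M-1}{m}\frac{(-1)^{m+1}}{m+1}\,e^{-mE_b\log_2 M/\mathcal{N}(m+1)} \tag{10.146c}$$

The error probability $P_{eM}$ is shown in Fig. 10.40 as a function of $E_b/\mathcal{N}$. It can be seen that
the performance of noncoherent MFSK is only slightly inferior to that of coherent MFSK,
particularly for large $M$.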
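Equation (10.146c) is a finite sum and can be evaluated directly. The sketch below uses an illustrative function name and takes $E_b/\mathcal{N}$ as a plain ratio (not in dB). Because the terms alternate in sign, the sum can lose precision for large $M$; the values of $M$ used here are small enough for that not to matter.

```python
import math

def pe_noncoherent_mfsk(m_ary, ebn):
    """Symbol error probability of noncoherent MFSK from Eq. (10.146c).

    m_ary is the number of frequencies M; ebn is Eb/N as a plain ratio.
    """
    k = math.log2(m_ary)
    total = 0.0
    for m in range(1, m_ary):
        total += (math.comb(m_ary - 1, m) * (-1) ** (m + 1) / (m + 1)
                  * math.exp(-m * k * ebn / (m + 1)))
    return total

if __name__ == "__main__":
    for m_ary in (2, 4, 16):
        print(m_ary, pe_noncoherent_mfsk(m_ary, 10.0))
```

Two sanity checks follow from the formula itself: for $M = 2$ it reduces to the binary result $\frac{1}{2}e^{-E_b/2\mathcal{N}}$ of Eq. (10.145), and for $E_b/\mathcal{N} = 0$ it gives $1 - 1/M$, the error probability of a pure guess.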


Differentially Coherent PSK 

Just as it is impossible to demodulate a DSB-SC signal with an envelope detector, it is also 
impossible to demodulate PSK (which is really DSB-SC) noncoherently. We can, however, 
demodulate PSK without the synchronous, or coherent, local carrier by using what is known 
as differential PSK (DPSK). 

The optimum receiver is shown in Fig, 10,41* This receiver is very much like a correlation 
detector (Fig. 10.3), which is equivalent to a matched filter detector. In a correlation detector, we 
multiply pulsep(f) by a locally generated pulsepO). In the case of DPSK, we take advantage of 
the fact that the two RF pulses used in transmission are identical except for the sign (or phase). 
In the detector in Fig. 10.41, we multiply the incoming pulse by the preceding pulse. Hence, 
the preceding pulse serves as a substitute for the locally generated pulse. The only difference 
is that the preceding pulse is noisy because of channel noise, and this tends to degrade the 
performance in comparison to coherent PSK. When the output r is positive, the present pulse 
is identical to the previous one, and when r is negative, the present pulse is the negative of 
the previous pulse. Hence, from the knowledge of the first reference digit, it is possible to 
detect all the received digits. Detection is facilitated by using so-called differential encoding, 
identical to what was discussed in Sec* 7.3.6 for duobinary signaling. 

To derive the DPSK error probability, we observe that DPSK by means of differential 
coding is essentially an orthogonal signaling scheme* A binary 1 is transmitted by a sequence 
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Figure 10.40 
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of two pulses (p, p) or (—/?, — p) over 27^ seconds (no transition). Similarly, a binary 0 is 
transmitted by a sequence of two pulses (p, — p) or (—/?, /;) over 27^ seconds (transition). 
Either of the pulse sequences used for binary 1 is orthogonal to either of the pulse sequences 
used for binary 0. Because no local carrier is generated for demodulation, the detection is 
noncoherent, with an effective pulse energy equal to 2E p (twice the energy of pulse p). The 
actual energy transmitted per digit is only E p , however, the same as in noncoherent FSK. 
Consequently, the performance of DPSK is 3 dB superior to that of noncoherent FSK. Hence 
from Eq. (10.145), we can write P^ for DPSK as 

$$P_b = \frac{1}{2}\, e^{-E_b/\mathcal{N}} \qquad (10.147)$$

This error probability (Fig. 10.42) is superior to that of noncoherent FSK by 3 dB and is 
essentially similar to coherent PSK for $E_b/\mathcal{N} \gg 1$ [Eq. (10.39)]. This is as expected, because 
we saw earlier that DPSK appears similar to PSK for large SNR. Rigorous derivation of 
Eq. (10.147) can be found in the literature.$^{7}$ 

Figure 10.42 Error probability 
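As a quick numerical check of these comparisons, the following sketch evaluates the three error probabilities at a few SNR values. The dB values and variable names are hypothetical; the noncoherent FSK expression is the one used as the analytical curve in Ex10_4.m, and the coherent PSK expression is the polar-signaling result used in Ex10_1.m.

```matlab
% Compare Pb of noncoherent FSK, DPSK, and coherent PSK (illustrative values).
EbN_dB  = [4 8 10 12];             % hypothetical E_b/N values in dB
EbN     = 10.^(EbN_dB/10);         % E_b/N as a ratio
Pb_fsk  = 0.5*exp(-EbN/2);         % noncoherent FSK
Pb_dpsk = 0.5*exp(-EbN);           % DPSK, Eq. (10.147)
Pb_psk  = 0.5*erfc(sqrt(EbN));     % coherent PSK, Q(sqrt(2*Eb/N))
for k = 1:numel(EbN_dB)
    fprintf('%4.1f dB: FSK %.3e  DPSK %.3e  PSK %.3e\n', ...
        EbN_dB(k), Pb_fsk(k), Pb_dpsk(k), Pb_psk(k));
end
% The 3 dB advantage: noncoherent FSK needs twice the E_b/N to reach the
% same error probability as DPSK.
Pb_fsk_2x = 0.5*exp(-(2*EbN)/2);
```

Here `Pb_fsk_2x` equals `Pb_dpsk` term by term, and `Pb_psk` stays below `Pb_dpsk` at every SNR, with the relative gap in dB shrinking as the SNR grows.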

10.12 MATLAB EXERCISES 

In this group of computer exercises, we give readers an opportunity to test the implementation 
and the performance of basic digital communication systems. 


COMPUTER EXERCISE 10.1: BINARY POLAR SIGNALING WITH DIFFERENT PULSES 

In the first exercise, we validate the performance analysis of the binary polar signaling presented 
in Section 10.1. Optimum (matched filter) detection is always used at the receiver. In the program 
Ex10_1.m, three different pulses are used for polar signaling: 

* Rectangular pulse $p(t) = u(t) - u(t - T)$. 

* Half-sine pulse $p(t) = \sin(\pi t/T)[u(t) - u(t - T)]$. 

* Root-raised cosine pulse with roll-off factor $r = 0.5$ (or bandwidth $0.75/T$) and truncated to duration 
of $6T$. 


% Matlab Program <Ex10_1.m>
% This Matlab exercise <Ex10_1.m> performs simulation of
% binary baseband polar transmission in AWGN channel.
% The program generates polar baseband signals using 3 different
% pulse shapes (root-raised cosine (r=0.5), rectangular, half-sine)
% and estimates the bit error rate (BER) at different Eb/N for display
clear;clf;
L=1000000;        % Total data symbols in experiment is 1 million
% To display the pulse shape, we oversample the signal
% by factor of f_ovsamp=8
f_ovsamp=8;       % Oversampling factor vs data rate
delay_rc=3;
% Generating root-raised cosine pulseshape (rolloff factor = 0.5)
% (rcosflt is a legacy Communications Toolbox function; in recent releases
%  rcosdesign(0.5,2*delay_rc,f_ovsamp,'sqrt') should give an equivalent pulse)
prcos=rcosflt([ 1 ], 1, f_ovsamp, 'sqrt', 0.5, delay_rc);
prcos=prcos(1:end-f_ovsamp+1);
prcos=prcos/norm(prcos);
pcmatch=prcos(end:-1:1);
% Generating a rectangular pulse shape
prect=ones(1,f_ovsamp);
prect=prect/norm(prect);
prmatch=prect(end:-1:1);
% Generating a half-sine pulse shape
psine=sin([0:f_ovsamp-1]*pi/f_ovsamp);
psine=psine/norm(psine);
psmatch=psine(end:-1:1);
% Generating random signal data for polar signaling
s_data=2*round(rand(L,1))-1;
% upsample to match the 'fictitious oversampling rate'
% which is f_ovsamp/T (T=1 is the symbol duration)
s_up=upsample(s_data,f_ovsamp);
% Identify the decision delays due to pulse shaping
% and matched filters
delayrc=2*delay_rc*f_ovsamp;
delayrt=f_ovsamp-1;
delaysn=f_ovsamp-1;
% Generate polar signaling of different pulse-shaping
xrcos=conv(s_up,prcos);
xrect=conv(s_up,prect);
xsine=conv(s_up,psine);
t=(1:200)/f_ovsamp;
subplot(311)
figwave1=plot(t,xrcos(delayrc/2:delayrc/2+199));
title('(a) Root-raised cosine pulse.');
set(figwave1,'Linewidth',2);
subplot(312)
figwave2=plot(t,xrect(delayrt:delayrt+199));
title('(b) Rectangular pulse.')
set(figwave2,'Linewidth',2);
subplot(313)
figwave3=plot(t,xsine(delaysn:delaysn+199));
title('(c) Half-sine pulse.')
xlabel('Number of data symbol periods')
set(figwave3,'Linewidth',2);
% Find the signal length
Lrcos=length(xrcos);Lrect=length(xrect);Lsine=length(xsine);
BER=[];
noiseq=randn(Lrcos,1);
% Generating the channel noise (AWGN)
for i=1:10,
    Eb2N(i)=i;                    %(Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=1/(2*Eb2N_num);         %1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    yrcos=xrcos+awgnois;
    yrect=xrect+awgnois(1:Lrect);
    ysine=xsine+awgnois(1:Lsine);
    % Apply matched filters first
    z1=conv(yrcos,pcmatch);clear awgnois yrcos;
    z2=conv(yrect,prmatch);clear yrect;
    z3=conv(ysine,psmatch);clear ysine;
    % Sampling the received signal and acquire samples
    z1=z1(delayrc+1:f_ovsamp:end);
    z2=z2(delayrt+1:f_ovsamp:end);
    z3=z3(delaysn+1:f_ovsamp:end);
    % Decision based on the sign of the samples
    dec1=sign(z1(1:L));dec2=sign(z2(1:L));dec3=sign(z3(1:L));
    % Now compare against the original data to compute BER for
    % the three pulses
    BER=[BER;sum(abs(s_data-dec1))/(2*L) ...
        sum(abs(s_data-dec2))/(2*L) ...
        sum(abs(s_data-dec3))/(2*L)];
    Q(i)=0.5*erfc(sqrt(Eb2N_num));    % Compute the analytical BER
end
figure(2)
subplot(111)
figber=semilogy(Eb2N,Q,'k-',Eb2N,BER(:,1),'b-+',...
    Eb2N,BER(:,2),'r-o',Eb2N,BER(:,3),'m-v');
legend('Analytical','Root-raised cosine','Rectangular','Half-sine')
xlabel('E_b/N (dB)');ylabel('BER')
set(figber,'Linewidth',2);
figure(3)
% Spectrum comparison
[Psd1,f]=pwelch(xrcos,[],[],[],f_ovsamp,'twosided');
[Psd2,f]=pwelch(xrect,[],[],[],f_ovsamp,'twosided');
[Psd3,f]=pwelch(xsine,[],[],[],f_ovsamp,'twosided');
figpsd1=semilogy(f-f_ovsamp/2,fftshift(Psd1));
ylabel('Power spectral density');
xlabel('frequency in unit of {1/T}');
tt1=title('(a) PSD using root-raised cosine pulse (rolloff factor r=0.5)');
set(tt1,'FontSize',11);
figure(4)
figpsd2=semilogy(f-f_ovsamp/2,fftshift(Psd2));
ylabel('Power spectral density');
xlabel('frequency in unit of {1/T}');
tt2=title('(b) PSD using rectangular NRZ pulse');
set(tt2,'FontSize',11);
figure(5)
figpsd3=semilogy(f-f_ovsamp/2,fftshift(Psd3));
ylabel('Power spectral density');
xlabel('frequency in unit of {1/T}');
tt3=title('(c) PSD using half-sine pulse');
set(tt3,'FontSize',11);


Figure 10.43 Snapshot of the modulated signals from three different pulse shapes: (a) root-raised cosine pulses, of rolloff factor = 0.5; (b) rectangular pulse; (c) half-sine pulse. 

Figure 10.44 BER of optimum (matched filter) detection of polar signaling using three different pulse shapes: (a) root-raised cosine pulse of roll-off factor 0.5; (b) rectangular pulse; (c) half-sine pulse. 

This program first shows the polar modulated binary signals in a snapshot given by Fig. 10.43. The 
three different waveforms are the direct results of their different pulse shapes. Nevertheless, their bit error 
rate (BER) performances are identical, as shown in Fig. 10.44. This confirms the results from Sec. 10.1 
that the polar signal performance is independent of the pulse shape. 
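The pulse-shape independence can also be checked without running the full simulation. The sketch below, under the assumption that both pulses are normalized to unit energy as in Ex10_1.m, computes the signal peak and the noise gain at the matched filter output for the rectangular and half-sine pulses. The names `peak_*`, `nvar_*`, and `snr_*` are hypothetical and used only for this illustration.

```matlab
% Matched filter output SNR for two unit-energy pulses.
f_ovsamp = 8;
prect = ones(1, f_ovsamp);                 prect = prect/norm(prect);
psine = sin((0:f_ovsamp-1)*pi/f_ovsamp);   psine = psine/norm(psine);

% Signal peak at the matched filter output (equals the pulse energy)
peak_rect = max(conv(prect, prect(end:-1:1)));
peak_sine = max(conv(psine, psine(end:-1:1)));

% Output noise variance per unit input noise variance (energy of the filter)
nvar_rect = sum(prect.^2);
nvar_sine = sum(psine.^2);

snr_rect = peak_rect^2/nvar_rect;
snr_sine = peak_sine^2/nvar_sine;
fprintf('SNR (rect) = %.4f, SNR (half-sine) = %.4f\n', snr_rect, snr_sine);
```

Both ratios come out equal because the matched filter output SNR depends only on the pulse energy, which is why the three BER curves in Fig. 10.44 coincide.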

The program also provides the power spectral density (PSD) for binary polar signaling using the 
three different modulated signals. From Fig. 10.45, we can see that the root-raised cosine pulse 
clearly requires the least bandwidth. The half-sine signaling exhibits a larger main lobe than the rectangular pulse but a smaller overall 
bandwidth. The sharp-edged rectangular pulse is the least bandwidth efficient. Thus, despite registering 
the same BER from simulation, the three different polar modulations require drastically different amounts 
of channel bandwidth. 

Figure 10.45 Power spectral density of the binary polar transmission using three different pulse shapes: (a) root-raised cosine pulse of roll-off factor 0.5; (b) rectangular NRZ pulse; (c) half-sine pulse. 


COMPUTER EXERCISE 10.2: ON-OFF BINARY SIGNALING 

Next, we present an exercise that implements and tests the on-off signaling as well as a more generic 
orthogonal type of signaling. Recall that on-off signaling is a special form of orthogonal binary signaling. 
MATLAB program Ex10_2.m will measure the receiver BER of both signaling schemes. 

% MATLAB PROGRAM <Ex10_2.m>
% This Matlab exercise <Ex10_2.m> generates
% on/off baseband signals using root-raised cosine
% pulseshape (rolloff factor = 0.5) and orthogonal baseband
% signal before estimating the bit error rate (BER) at different
% Eb/N ratio for display and comparison
clear;clf
L=1000000;        % Total data symbols in experiment is 1 million
% To display the pulse shape, we oversample the signal
% by factor of f_ovsamp=16
f_ovsamp=16;      % Oversampling factor vs data rate
delay_rc=3;
% Generating root-raised cosine pulseshape (rolloff factor = 0.5)
% (rcosflt is a legacy Communications Toolbox function; in recent releases
%  rcosdesign(0.5,2*delay_rc,f_ovsamp,'sqrt') should give an equivalent pulse)
prcos=rcosflt([ 1 ], 1, f_ovsamp, 'sqrt', 0.5, delay_rc);
prcos=prcos(1:end-f_ovsamp+1);
prcos=prcos/norm(prcos);
pcmatch=prcos(end:-1:1);
% Generating a half-sine pulse shape
psinh=sin([0:f_ovsamp-1]*pi/f_ovsamp);
psinh=psinh/norm(psinh);
phmatch=psinh(end:-1:1);
% Generating a full-cycle sine pulse shape
psine=sin([0:f_ovsamp-1]*2*pi/f_ovsamp);
psine=psine/norm(psine);
psmatch=psine(end:-1:1);
% Generating random signal data for on/off signaling
s_data=round(rand(L,1));
% upsample to match the 'fictitious oversampling rate'
% which is f_ovsamp/T (T=1 is the symbol duration)
s_up=upsample(s_data,f_ovsamp);
s_cp=upsample(1-s_data,f_ovsamp);
% Identify the decision delays due to pulse shaping
% and matched filters
delayrc=2*delay_rc*f_ovsamp;
delayrt=f_ovsamp-1;
% Generate on/off and orthogonal signaling
xrcos=conv(s_up,prcos);
xorth=conv(s_up,psinh)+conv(s_cp,psine);
t=(1:200)/f_ovsamp;
figure(1)
subplot(211)
figwave1=plot(t,xrcos(delayrc/2:delayrc/2+199));
title('(a) On/off root-raised cosine pulse.');
set(figwave1,'Linewidth',2);
subplot(212)
figwave2=plot(t,xorth(delayrt:delayrt+199));
title('(b) Orthogonal modulation.')
set(figwave2,'Linewidth',2);
% Find the signal length
Lrcos=length(xrcos);Lrect=length(xorth);
BER=[];
noiseq=randn(Lrcos,1);
% Generating the channel noise (AWGN)
for i=1:12,
    Eb2N(i)=i;                    %(Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=1/(2*Eb2N_num);         %1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    yrcos=xrcos+awgnois/sqrt(2);
    yorth=xorth+awgnois(1:Lrect);
    % Apply matched filters first
    z1=conv(yrcos,pcmatch);clear awgnois yrcos;
    z2=conv(yorth,phmatch);
    z3=conv(yorth,psmatch);clear yorth;
    % Sampling the received signal and acquire samples
    z1=z1(delayrc+1:f_ovsamp:end);
    z2=z2(delayrt+1:f_ovsamp:end-f_ovsamp+1);
    z3=z3(delayrt+1:f_ovsamp:end-f_ovsamp+1);
    % Decision: threshold at 0.5 for on/off, larger output for orthogonal
    dec1=round((sign(z1(1:L)-0.5)+1)*.5);dec2=round((sign(z2-z3)+1)*.5);
    % Now compare against the original data to compute BER for
    % the two signaling schemes
    BER=[BER;sum(abs(s_data-dec1))/L sum(abs(s_data-dec2))/L];
    Q(i)=0.5*erfc(sqrt(Eb2N_num/2));    % Compute the analytical BER
end
figure(2)
subplot(111)
figber=semilogy(Eb2N,Q,'k-',Eb2N,BER(:,1),'b-*',Eb2N,BER(:,2),'r-o');
fleg=legend('Analytical','Root-raised cosine on/off','Orthogonal signaling');
fx=xlabel('E_b/N (dB)');fy=ylabel('BER');
set(figber,'Linewidth',2);set(fleg,'FontSize',11);
set(fx,'FontSize',11);
set(fy,'FontSize',11);
% We can plot the individual pulses used for the binary orthogonal
% signaling
figure(3)
subplot(111);
pulse=plot((0:f_ovsamp)/f_ovsamp,[psinh 0],'k-',...
    (0:f_ovsamp)/f_ovsamp,[psine 0],'k-o');
pleg=legend('Half-sine pulse','Sine pulse');
ptitle=title('Binary orthogonal signals');
set(pulse,'Linewidth',2);
set(pleg,'Fontsize',10);
set(ptitle,'Fontsize',11);

Figure 10.46 Waveforms of the two pulses used in orthogonal binary signaling: solid curve, half-sine pulse; curve with circles, sine pulse. 

Figure 10.47 Measured BER results in comparison with analytical BER. 
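The orthogonality of the two pulses used in Ex10_2.m (the half-sine pulse `psinh` and the full-cycle sine pulse `psine`) can be verified numerically with a short sketch. The variable `rho` is a hypothetical name for their inner product.

```matlab
% Check that the half-sine and full-cycle sine pulses are orthogonal.
f_ovsamp = 16;
n = 0:f_ovsamp-1;
psinh = sin(n*pi/f_ovsamp);     psinh = psinh/norm(psinh);   % half-sine
psine = sin(n*2*pi/f_ovsamp);   psine = psine/norm(psine);   % full-cycle sine
rho = psinh*psine.';            % inner product of the two unit-energy pulses
fprintf('Inner product of the two pulses: %.2e\n', rho);
```

The inner product vanishes up to floating-point rounding, so each matched filter responds only to its own pulse at the sampling instant, and the receiver can decide by comparing the two filter outputs.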


For the on-off signaling, we will continue to use the root-raised cosine pulse from Computer Exercise 10.1. 
For a more generic orthogonal signaling, we use two pulse shapes of length $T$. Figure 10.46 
shows these orthogonal pulses. Finally, Fig. 10.47 displays the measured BER for both signaling schemes 
against the BER obtained from analysis. It is not surprising that both measured results match the analytical 
BER very well. 


COMPUTER EXERCISE 10.3: 16-QAM MODULATION 

In this exercise, we will consider a more complex QAM constellation for transmission. The M-ary QAM 
was analyzed in Sec. 10.6.6. In MATLAB program Ex10_3.m, we control the transmission bandwidth 
by applying the root-raised cosine pulse with roll-off factor of 0.5 as the baseband pulse shape. For each 
symbol period $T$, eight uniform samples are used to approximate and emulate the continuous time signals. 
Figure 10.48 illustrates the open eye diagram of the in-phase (real) part of the matched filter output prior 
to being sampled. Very little ISI is observed at the point of sampling, validating the use of the root-raised 
cosine pulse shape in conjunction with the matched filter detector for ISI-free transmission. 

Figure 10.48 Eye diagram of the real (in-phase) component of the 16-QAM transmission at the receiver matched filter output. 

% Matlab Program <Ex10_3.m>
% This Matlab exercise <Ex10_3.m> performs simulation of
% QAM-16 baseband transmission in AWGN channel.
% Root-raised cosine pulse of rolloff factor = 0.5 is used
% Matched filter receiver is designed to detect the symbols
% The program estimates the symbol error rate (SER) at different Eb/N
clear;clf;
L=1000000;        % Total data symbols in experiment is 1 million
% To display the pulse shape, we oversample the signal
% by factor of f_ovsamp=8
f_ovsamp=8;       % Oversampling factor vs data rate
delay_rc=4;
% Generating root-raised cosine pulseshape (rolloff factor = 0.5)
% (rcosflt is a legacy Communications Toolbox function; in recent releases
%  rcosdesign(0.5,2*delay_rc,f_ovsamp,'sqrt') should give an equivalent pulse)
prcos=rcosflt([ 1 ], 1, f_ovsamp, 'sqrt', 0.5, delay_rc);
prcos=prcos(1:end-f_ovsamp+1);
prcos=prcos/norm(prcos);
pcmatch=prcos(end:-1:1);
% Generating random signal data for 16-QAM signaling
s_data=4*round(rand(L,1))+2*round(rand(L,1))-3+...
    +j*(4*round(rand(L,1))+2*round(rand(L,1))-3);
% upsample to match the
% 'oversampling rate'
% which is f_ovsamp/T (T=1 is the symbol duration)
s_up=upsample(s_data,f_ovsamp);
% Identify the decision delays due to pulse shaping
% and matched filters
delayrc=2*delay_rc*f_ovsamp;
% Generate QAM-16 signaling with pulse-shaping
xrcos=conv(s_up,prcos);
% Find the signal length
Lrcos=length(xrcos);
SER=[];
noiseq=randn(Lrcos,1)+j*randn(Lrcos,1);
Es=10;            % symbol energy
% Generating the channel noise (AWGN)
for i=1:9,
    Eb2N(i)=i*2;                  %(Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Es/(2*Eb2N_num);        %1/SNR is the noise variance
    signois=sqrt(Var_n/2);        % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    yrcos=xrcos+awgnois;
    % Apply matched filters first
    z1=conv(yrcos,pcmatch);clear awgnois yrcos;
    % Sampling the received signal and acquire samples
    z1=z1(delayrc+1:f_ovsamp:end);
    % Decision based on the sign of the samples
    dec1=sign(real(z1(1:L)))+sign(real(z1(1:L))-2)+...
        sign(real(z1(1:L))+2)+...
        j*(sign(imag(z1(1:L)))+sign(imag(z1(1:L))-2)+...
        sign(imag(z1(1:L))+2));
    % Now compare against the original data to compute SER
    %BER=[BER;sum(abs(s_data-dec1))/(2*L)]
    SER=[SER;sum(s_data~=dec1)/L];
    Q(i)=3*0.5*erfc(sqrt((2*Eb2N_num/5)/2));   % Compute the analytical SER
end
figure(1)
subplot(111)
figber=semilogy(Eb2N,Q,'k-',Eb2N,SER,'b-*');
axis([2 13 .99e-5 1]);
legend('Analytical','Root-raised cosine');
xlabel('E_b/N (dB)');ylabel('Symbol error probability');
set(figber,'Linewidth',2);
% Constellation plot
figure(2)
subplot(111)
plot(real(z1(1:min(L,4000))),imag(z1(1:min(L,4000))),'.');
axis('square')
xlabel('Real part of matched filter output samples')
ylabel('Imaginary part of matched filter output samples')

Figure 10.49 Symbol error probability of 16-QAM using root-raised cosine pulse in comparison with the analytical result. 

Because the signal uses 16-QAM constellations, instead of measuring the BER, we will measure 
the symbol error rate (SER) at the receiver. Figure 10.49 illustrates that the measured SER matches the 
analytical result from Sec. 10.6 very closely. 


The success of the optimum QAM receiver can also be shown by observing the real part 
and the imaginary part of the samples taken at the matched filter output. By using a dot to 
represent each measured sample, we create what is known as a "scatter plot," which clearly 
demonstrates the reliability of the decision that follows. If the dots in the scatter plot are closely 
clustered around the original constellation point, then the decision is most likely going to 
be reliable. Otherwise, a large number of decision errors can occur. Figure 10.50 illustrates 
the scatter plot from the measurement taken at the receiver when $E_b/\mathcal{N} = 18$ dB. The close 
clustering of the measured sample points is a strong indication that the resulting SER will be 
very low. 
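The decision rule in Ex10_3.m maps each real or imaginary sample to the nearest level in {-3, -1, 1, 3} by summing three sign functions with thresholds at -2, 0, and 2. A small sketch with hypothetical sample values shows how this slicer behaves on one dimension:

```matlab
% Four-level slicer built from three sign() comparisons (thresholds -2, 0, 2).
x   = [-3.4 -2.1 -1.7 -0.2 0.3 1.9 2.2 5.0];   % hypothetical noisy samples
dec = sign(x) + sign(x-2) + sign(x+2);         % nearest level in {-3,-1,1,3}
disp(dec)                                      % -3 -3 -1 -1 1 1 3 3
```

A sample that falls exactly on a threshold would yield an even value outside the constellation, but with continuous-valued noise this happens with probability zero. Applying the same rule to the real and imaginary parts separately gives the 16-QAM decision.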


COMPUTER EXERCISE 10.4: NONCOHERENT FSK DETECTION 

To test the results of a noncoherent binary FSK receiver, we provide MATLAB program Ex10_4.m, 
which assumes the orthogonality of the two frequencies used in FSK. As expected, the measured BER 
results in Figure 10.51 match the analytical BER results very well. 


% MATLAB PROGRAM <Ex10_4.m>
% This program provides simulation for noncoherent detection of
% orthogonal signaling including BFSK. Noncoherent MFSK detection
% only needs to compare the magnitude of each frequency bin.
L=100000;         % Number of data symbols in the simulation
s_data=round(rand(L,1));
% Generating random phases on the two frequencies
xbase1=[exp(j*2*pi*rand) 0];
xbase0=[0 exp(j*2*pi*rand)];
% Modulating two orthogonal frequencies
xmodsig=s_data*xbase1+(1-s_data)*xbase0;
% Generating noise sequences for both frequency channels
noisei=randn(L,2);
noiseq=randn(L,2);
BER=[];
BER_az=[];
% Generating the channel noise (AWGN)
for i=1:12,
    Eb2N(i)=i;                          %(Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);           % Eb/N in numeral
    Var_n=1/(2*Eb2N_num);               %1/SNR is the noise variance
    signois=sqrt(Var_n);                % standard deviation
    awgnois=signois*(noisei+j*noiseq);  % AWGN complex channels
    % Add noise to signals at the channel output
    ychout=xmodsig+awgnois;
    % Non-coherent detection
    ydim1=abs(ychout(:,1));
    ydim2=abs(ychout(:,2));
    dec=(ydim1>ydim2);
    % Compute BER from simulation
    BER=[BER; sum(dec~=s_data)/L];
    % Compare against analytical BER.
    BER_az=[BER_az; 0.5*exp(-Eb2N_num/2)];
end
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BER,'k-o');
set(figber,'Linewidth',2);
legend('Analytical BER','Noncoherent FSK simulation');
fx=xlabel('E_b/N (dB)');
fy=ylabel('Bit error rate');
set(fx,'FontSize',11); set(fy,'FontSize',11);

Figure 10.50 Scatter plot of the matched filter output for the 16-QAM signaling with root-raised cosine pulse when $E_b/\mathcal{N} = 18$ dB. 

Figure 10.51 BER from noncoherent detection of binary FSK. 


COMPUTER EXERCISE 10.5: NONCOHERENT DETECTION OF BINARY 
DIFFERENTIAL PSK 

To test the results of a binary differential phase shift keying system, we present MATLAB program 
Ex10_5.m. As in previous cases, the measured BER results in Figure 10.52 match the analytical BER 
results very well. 


% MATLAB PROGRAM <Ex10_5.m>
% This program provides simulation for differential detection of
% binary DPSK. Differential detection only needs to compare the
% successive phases of the signal samples at the receiver
%
clear;clf
L=1000000;        % Number of data symbols in the simulation
s_data=round(rand(L,1));
% Generating initial random phase
initphase=[2*rand];
% differential modulation
s_denc=mod(cumsum([0;s_data]),2);
% define the phase divisible by pi
xphase=initphase+s_denc;
clear s_denc;
% modulate the phase of the signal
xmodsig=exp(j*pi*xphase); clear xphase;
Lx=length(xmodsig);
% Generating noise sequence
noiseq=randn(Lx,2);
BER=[];
BER_az=[];
% Generating the channel noise (AWGN)
for i=1:11,
    Eb2N(i)=i;                        %(Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);         % Eb/N in numeral
    Var_n=1/(2*Eb2N_num);             %1/SNR is the noise variance
    signois=sqrt(Var_n);              % standard deviation
    awgnois=signois*(noiseq*[1;j]);   % AWGN complex channels
    % Add noise to signals at the channel output
    ychout=xmodsig+awgnois;
    % Non-coherent detection
    yphase=angle(ychout);             % find the channel output phase
    clear ychout;
    ydfdec=diff(yphase)/pi;           % calculate phase difference
    % wrap the difference into [-1,1) (in units of pi) so that jumps across
    % the +/-pi boundary of angle() are not mistaken for phase transitions
    ydfdec=mod(ydfdec+1,2)-1;
    clear yphase;
    dec=(abs(ydfdec)>0.5);            % make hard decisions
    clear ydfdec;
    % Compute BER from simulation
    BER=[BER; sum(dec~=s_data)/L];
    % Compare against analytical BER.
    BER_az=[BER_az; 0.5*exp(-Eb2N_num)];
end
% now plot the results
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BER,'k-o');
axis([1 11 .99e-5 1]);
set(figber,'Linewidth',2);
legend('Analytical BER','Binary DPSK simulation');
fx=xlabel('E_b/N (dB)');
fy=ylabel('Bit error rate');
set(fx,'FontSize',11); set(fy,'FontSize',11);

Figure 10.52 Analytical BER results from noncoherent detection of binary DPSK simulation (round points). 
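Differential detection can also be sketched without computing phases explicitly, by multiplying each received sample by the complex conjugate of the preceding one. The following noise-free illustration uses a hypothetical bit pattern and a phase offset chosen near $\pi$ on purpose; the encoding follows Ex10_5.m, where a data bit 1 toggles the phase. The names `x`, `d`, and `dec` are assumptions made for this example.

```matlab
% Differential detection via the product of successive samples (noise-free).
s_data = [1 0 0 1 1 0 1].';              % hypothetical data bits (1 = phase change)
s_denc = mod(cumsum([0; s_data]), 2);    % differential encoding as in Ex10_5.m
x = exp(1j*pi*(0.97 + s_denc));          % DPSK samples, phase offset near pi
d = x(2:end).*conj(x(1:end-1));          % angle(d) is the phase difference
dec = double(real(d) < 0);               % difference near pi -> bit 1
disp(isequal(dec, s_data))               % logical true in the noise-free case
```

Because the phase of the product is the phase difference itself, this form needs no handling of the discontinuity of `angle` at $\pm\pi$, and the decision reduces to the sign of the real part.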
PROBLEMS 


10.1-1 The so-called integrate-and-dump filter is shown in Fig. P10.1-1. The feedback amplifier is an 
ideal integrator. The switch $s_1$ closes momentarily and then opens at the instant $t = T_b$, thus 
dumping all the charge on $C$ and causing the output to go to zero. The switch $s_2$ samples the 
output immediately before the dumping action. 

(a) Sketch the output $p_o(t)$ when a square pulse $p(t)$ is applied to the input of this filter. 

(b) Sketch the output $p_o(t)$ of the filter matched to the square pulse $p(t)$. 

(c) Show that the performance of the integrate-and-dump filter is identical to that of the 
matched filter; that is, show that $\rho$ in both cases is identical. 

Figure P10.1-1 

10.1-2 An alternative to the optimum filter is a suboptimum filter, where we assume a particular filter 
form and adjust its parameters to maximize $\rho$. Such filters are inferior to the optimum filter but 
may be simpler to design. 

For a rectangular pulse $p(t)$ of height $A$ and width $T_b$ at the input (Fig. P10.1-2), determine 
$\rho_{\max}$ if, instead of the matched filter, a one-stage $RC$ filter with $H(\omega) = 1/(1 + j\omega RC)$ is used. 
Assume a white Gaussian noise of PSD $\mathcal{N}/2$. Show that the optimum performance is achieved 
when $1/RC = 1.26/T_b$. 

Hint: Set $d\rho^2/dx = 0$ ($x = T_b/RC$). 

Figure P10.1-2 


10.2-1 In coherent detection of a binary PPM, a half-width pulse $p_0(t)$ is transmitted with different 
delays for binary digits "0" and "1" over $0 < t < T_b$. Note that 

$$p_0(t) = u(t) - u(t - T_b/2)$$

The binary PPM transmission is to simply transmit 

$$\begin{cases} p_0(t), & \text{if "0" is sent} \\ p_0(t - T_b/2), & \text{if "1" is sent} \end{cases}$$

The channel noise is AWGN with spectrum level of $\mathcal{N}/2$. 

(a) Determine the optimum receiver architecture for this binary system. Sketch the optimum 
receiver filter response in the time domain. 

(b) If P["0"] = 0.4 and P["1"] = 0.6, find the optimum threshold and the resulting receiver 
bit error rate. 

(c) The receiver was misinformed and believes that P["0"] = 0.5 = P["1"]. It hence designed 
a receiver based on this information. Find the true probability of error when, in fact, the 
actual prior probabilities are P["0"] = 0.4 and P["1"] = 0.6. Compare this result with the 
result in part (b). 

10.2-2 In the coherent detection of binary chirp modulations, the transmission over $0 < t < T_b$ is 

$$\begin{cases} A\cos(\alpha_0 t^2 + \theta_0), & \text{if "0" is sent} \\ A\cos(\alpha_1 t^2 + \theta_1), & \text{if "1" is sent} \end{cases}$$

The channel noise is AWGN with spectrum $\mathcal{N}/2$. The binary digits are equally likely. 

(a) Design the optimum receiver. 

(b) Find the probability of bit error for the optimum receiver in part (a). 

10.2-3 In coherent schemes, a small pilot is added for synchronization. Because the pilot does not 
carry information, it causes degradation in $P_b$. Consider a coherent PSK that uses the following 
two pulses of duration $T_b$: 

$$p(t) = A\sqrt{1 - m^2}\,\cos\omega_c t + Am\sin\omega_c t$$
$$q(t) = -A\sqrt{1 - m^2}\,\cos\omega_c t + Am\sin\omega_c t$$

where $Am\sin\omega_c t$ is the pilot. Show that when the channel noise is white Gaussian, 

$$P_b = Q\left(\sqrt{\frac{2E_b(1 - m^2)}{\mathcal{N}}}\right)$$

Hint: Use Eq. (10.25b). 

10.2-4 For polar binary communication systems, each error in the decision has some cost. Suppose 
that when $m = 1$ is transmitted and we read it as $m = 0$ at the receiver, a quantitative penalty, 
or cost, $C_{10}$ is assigned to such an error, and, similarly, a cost $C_{01}$ is assigned when $m = 0$ is 
transmitted and we read it as $m = 1$. For the polar case where $P_m(0) = P_m(1) = 0.5$, show 
that for white Gaussian channel noise, the optimum threshold that minimizes the overall cost 
is not 0 but is given by 

$$a_o = \frac{\mathcal{N}}{4}\ln\frac{C_{01}}{C_{10}}$$

Hint: See Hint for Prob. 8.2-11. 

10.2-5 For a polar binary system with unequal message probabilities, show that the optimum decision 
threshold $a_o$ is given by 

$$a_o = \frac{\mathcal{N}}{4}\ln\frac{P_m(0)\,C_{01}}{P_m(1)\,C_{10}}$$

where $C_{01}$ and $C_{10}$ are the cost of the errors as explained in Prob. 10.2-4, and $P_m(0)$ and $P_m(1)$ 
are the probabilities of transmitting 0 and 1, respectively. 

Hint: See Hint for Prob. 8.2-11. 

10.2-6 For 4-ary communication, messages are chosen from any one of four message symbols, $m_1 = 00$, 
$m_2 = 01$, $m_3 = 10$, and $m_4 = 11$, which are transmitted by pulses $\pm p(t)$, 0, and $\pm 3p(t)$, 
respectively. A filter matched to $p(t)$ is used at the receiver. Denote the energy of $p(t)$ as $E_p$. 
The channel noise is AWGN with spectrum $\mathcal{N}/2$. 

(a) If $r$ is the matched filter output at the sampling instant, plot $p_r(r|m_i)$ for the four message 
symbols (00, 01, 10, and 11), assuming that all message symbols are equally likely. 

(b) To minimize the probability of detection error in part (a), determine the optimum decision 
thresholds and the corresponding error probability $P_e$ as a function of the average symbol 
energy to noise ratio. 

10.2-7 Binary data is transmitted by using a pulse $p(t)$ for 0 and a pulse $\gamma p(t)$ for 1. Let $\gamma > 1$. Show 
that the optimum receiver for this case consists of a filter matched to $p(t)$ plus a detection 
threshold as shown in Fig. P10.2-7. Determine the error probability $P_b$ of this receiver as a 
function of $E_b/\mathcal{N}$ if 0 and 1 are equiprobable. 

Figure P10.2-7 (matched filter output $r$ sampled at $t = T_b$; decision: 0 if $r <$ threshold, 1 if $r >$ threshold) 

10.2-8 In a binary transmission, a raised-cosine roll-off pulse $p(t)$ with roll-off factor 0.2 is used for 
baseband polar transmission. The ideal low-pass channel has a bandwidth of $f_0 = 5000$ Hz. 

(a) If the channel noise is AWGN with spectrum $\mathcal{N}/2$, find the optimum receiver filter and 
sketch its frequency response. 

(b) If the channel noise is Gaussian with spectrum 

$$S_n(f) = 0.5\,\mathcal{N}\,\frac{1}{1 + (f/f_0)^2}$$

find the optimum receiver filter and sketch its frequency response. 


10.3-1 In an FSK system, RF binary signals are transmitted as 

$$0: \quad \sqrt{2}\,\sin(\pi t/T_b)\cos[\omega_c - (\Delta\omega/2)]t, \qquad 0 < t < T_b$$
$$1: \quad \sqrt{2}\,\sin(\pi t/T_b)\cos[\omega_c + (\Delta\omega/2)]t, \qquad 0 < t < T_b$$

The channel noise is AWGN. Let the binary inputs be equally likely. 

(a) Derive the optimum coherent receiver and the optimum threshold. 

(b) Find the minimum probability of bit error. 

(c) Is it possible to find the optimum $\Delta\omega$ to minimize the probability of bit error? 

10.4-1 Consider four signals in the time interval $(0, T)$: 

$$p_0(t) = u(t) - u(t - T)$$
$$p_1(t) = \sin(2\pi t/T)[u(t) - u(t - T)]$$
$$p_2(t) = \sin(\pi t/T)[u(t) - u(t - T)]$$
$$p_3(t) = \cos(\pi t/T)[u(t) - u(t - T)]$$

Apply the Gram-Schmidt procedure and find a set of orthonormal basis signals for this signal 
space. What is the dimension of this signal space? 


10.4-2 The basis signals of a three-dimensional signal space are given by $\varphi_1(t) = p(t)$, $\varphi_2(t) = p(t - T_0)$, 
and $\varphi_3(t) = p(t - 2T_0)$, where 


P(0 = 



MO - u(t - 


To)] 


(a) Sketch the waveforms of the signals represented by (1, 1,1), (-2, 0, 1), (1/3, 2, - 

and (— — 1, 2) in this space. 

(b) Find the energy of each signal in part (a). 

10.4-3 Repeat Prob. 10.4-2 if 




y/To 


<P2(t) = 


2 71 

— cos— f <P 3 (r) 

*0 to 



$0 < t < T_0$ 

10.4-4 For the three basis signals given in Prob. 10.4-3, assume that a signal is written as 

x(0 = 1 + 2sin 3 

(a) Use the three basis signals in terms of minimum error energy to find the best approximation 
of $x(t)$. What is the minimum approximation error energy? 

(b) By adding another basis signal 


V>4(0 = 



0 < t < T 0 


find the reduction of minimum approximation error energy. 

10.4-5 Assume that $p(t)$ is as in Prob. 10.4-2 and 

$$\varphi_k(t) = p[t - (k - 1)T_0], \qquad k = 1, 2, 3, 4, 5$$

(a) Sketch the signals represented by $(-1, 2, 3, 1, 4)$, $(2, 1, -4, -4, 2)$, $(3, -2, 3, 4, 1)$, and 
$(-2, 4, 2, 2, 0)$ in this space. 

(b) Find the energy of each signal. 

(c) Find the angle between all pairs of the signals. 

Hint: Recall that the inner product between vectors $\mathbf{a}$ and $\mathbf{b}$ is related to the angle $\theta$ between 
the two vectors via $\langle \mathbf{a}, \mathbf{b} \rangle = \|\mathbf{a}\| \cdot \|\mathbf{b}\| \cos(\theta)$. 
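As an illustration of the hint, the relation can be evaluated numerically for one pair of the coordinate vectors listed in part (a). This is only a sketch of the computation, not a full solution: the variable names are hypothetical, and the energy line assumes that the basis signals $\varphi_k(t)$ are orthonormal, so that the signal energy equals the squared norm of the coordinate vector.

```matlab
% Angle between two signal vectors from their inner product.
a = [-1  2  3  1  4];          % coordinates of the first signal
b = [ 2  1 -4 -4  2];          % coordinates of the second signal
Ea = sum(a.^2);                % energy, assuming an orthonormal basis
Eb = sum(b.^2);
cos_theta = dot(a, b)/(norm(a)*norm(b));
theta_deg = acosd(cos_theta);  % angle in degrees
fprintf('Ea = %g, Eb = %g, angle = %.2f degrees\n', Ea, Eb, theta_deg);
```

The angle does not depend on any common scaling of the basis signals, whereas the energies scale with the energy of $p(t)$ if the basis is not normalized.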

10.5-1 Assume that $p(t)$ is as in Prob. 10.4-2 and 

$$\varphi_k(t) = p[t - (k - 1)T_0], \qquad k = 1, 2, 3, 4, 5$$

When $\varphi_k(t)$ is transmitted, the received signal under noise $n_w(t)$ is 

$$y(t) = \varphi_k(t) + n_w(t), \qquad 0 < t < 5T_0$$

Given a noise $n_w(t)$ that is white Gaussian with spectrum $\mathcal{N}/2$, complete the following. 

(a) Define a set of basis functions for $y(t)$ such that 

$$E\left\{\left|y(t) - \sum_i y_i\,\varphi_i(t)\right|^2\right\} = 0$$

(b) Characterize the random variable $y_i$ when $\varphi_k(t)$ is transmitted. 

(c) Determine the joint probability density function of random variables $\{y_1, \ldots, y_5\}$ when 
$\varphi_k(t)$ is transmitted. 

10.5-2 For a certain stationary Gaussian random process $x(t)$, it is given that $R_x(\tau) = e^{-\tau^2}$. Determine 
the joint PDF of RVs $x(t)$, $x(t + 0.5)$, $x(t + 1)$, and $x(t + 2)$. 

10.5-3 A Gaussian noise is characterized by its mean and its autocorrelation function. A stationary 
Gaussian noise $x(t)$ has zero mean and autocorrelation function $R_x(\tau)$. 

(a) If $x(t)$ is the input to a linear time-invariant system with impulse response $h(t)$, determine 
the mean and the autocorrelation function of the linear system output $y(t)$. 

(b) If $x(t)$ is the input to a linear time-varying system whose output is 

$$y(t) = \int_{-\infty}^{\infty} h(t, \tau)\,x(\tau)\,d\tau$$

show what kind of output process this generates, and determine the mean and the 
autocorrelation function of the linear system output $y(t)$. 

10.5-4 Determine the output PSD of the linear system in part (a) of Prob. 10.5-3. 

10.5-5 Determine the output PSD of the linear system in part (b) of Prob. 10.5-3. 

10.6-1 Consider the preprocessing of Fig. 10.17. The channel noise $n_w(t)$ is white Gaussian. 

(a) Find the signal energy of $r(t)$ and $q(t)$ over the finite time interval $[0, T_M]$. 

(b) Prove that although $r(t)$ and $q(t)$ are not equal, both contain all the useful signal content. 

(c) Show that the joint probability density function of $(q_1, q_2, \ldots, q_N)$, under the condition 
that $s_k(t)$ is transmitted, can be written as 

$$p_{\mathbf{q}}(\mathbf{q} \mid \mathbf{s}_k) = \frac{1}{(\pi\mathcal{N})^{N/2}} \exp\left(-\frac{\|\mathbf{q} - \mathbf{s}_k\|^2}{\mathcal{N}}\right)$$


10.6-2 Consider an additive white noise channel. After signal projection, the received $N \times 1$ signal 
vector is given by 

$$\mathbf{q} = \mathbf{s}_i + \mathbf{n}$$

when message $m_i$ is transmitted. The noise vector $\mathbf{n}$ has joint probability density function 



(a) Find the (MAP) detector that can minimize the probability of detection error. 

(b) Follow the derivations of optimum detector for AWGN noise to derive the optimum receiver 
structure for this non-Gaussian white noise channel. 

(c) Show how the decision regions are different between Gaussian and non-Gaussian noises 
in a two-dimensional ($N = 2$) signal space. 

10.6-3 A binary source emits data at a rate of 400,000 bit/s. Multiamplitude shift keying (PAM) with
$M = 2$, 16, and 32 is considered. In each case, determine the signal power required at the
receiver input and the minimum transmission bandwidth required if $S_n(\omega) = 10^{-8}$ and the bit
error rate is required to be less than $10^{-6}$.

10.6-4 Repeat Prob. 10.6-3 for $M$-ary PSK.

10.6-5 A source emits $M$ equiprobable messages, which are assigned signals $s_1, s_2, \ldots, s_M$, as shown
in Fig. P10.6-5. Determine the optimum receiver and the corresponding error probability $P_{eM}$
for an AWGN channel as a function of $E_b$.

Figure P10.6-5

10.6-6 A source emits eight equiprobable messages, which are assigned QAM signals $s_1, s_2, \ldots, s_8$,
as shown in Fig. P10.6-6.

(a) Find the optimum receiver for an AWGN channel.

(b) Determine the decision regions and the error probability $P_{eM}$ of the optimum receiver as
a function of $E_b$.

Figure P10.6-6

10.6-7 Prove that for $E_b/\mathcal{N} \gg 1$ and $M \gg 2$, the error probability approximation of Eq. (10.109b)
for MPSK holds.

10.6-8 Use the approximation of Eq. (10.109b) for 16-PSK to compare the symbol error probabilities of
16-QAM and 16-PSK. Show approximately how many decibels of $E_b/\mathcal{N}$ (SNR) loss 16-PSK
incurs versus 16-QAM (by ignoring the constant difference in front of the $Q$-function).

10.6-9 Compare the symbol error probabilities of 16-PAM, 16-PSK, and 16-QAM. Sketch them as
functions of $E_b/\mathcal{N}$.

10.6-10 Show that for MPSK, the optimum receiver of the form in Fig. 10.19a is equivalent to a phase
comparator. Assume all messages equiprobable and an AWGN channel.

10.6-11 A ternary signaling has three signals for transmission:

$$m_0: 0, \qquad m_1: 2p(t), \qquad m_2: -2p(t)$$

(a) If $P(m_0) = P(m_1) = P(m_2) = 1/3$, determine the optimum decision regions and $P_{eM}$ of
the optimum receiver as a function of $E$. Assume an AWGN channel.

(b) Find $P_{eM}$ as a function of $E/\mathcal{N}$.

(c) Repeat parts (a) and (b) if $P(m_0) = 1/2$ and $P(m_1) = P(m_2) = 0.25$.

10.6-12 A 16-ary signal configuration is shown in Fig. P10.6-12. Write the expression (do not evaluate
various integrals) for the $P_{eM}$ of the optimum receiver, assuming all symbols to be
equiprobable. Assume an AWGN channel.

10.6-13 A five-signal configuration in a two-dimensional space is shown in Fig. P10.6-13.

(a) Choose $\varphi_1(t) = \sqrt{2/T_0}\cos\omega_c t$ and $\varphi_2(t) = \sqrt{2/T_0}\sin\omega_c t$ and sketch the waveforms
of the five signals.






Figure P10.6-12

(b) In the signal space, sketch the optimum decision regions, assuming an AWGN channel.

(c) Determine the error probability $P_{eM}$ as a function of $E$ of the optimum receiver.

Figure P10.6-13

10.6-14 A 16-point QAM signal configuration is shown in Fig. P10.6-14. Assuming that all symbols are
equiprobable, determine the error probability $P_{eM}$ as a function of $E_b$ of the optimum receiver
for an AWGN channel.

Compare the performance of this scheme with the result of rectangular 16-point QAM in
Sec. 10.6.

Figure P10.6-14

10.7-1 The vertices of an $N$-dimensional hypercube are a set of signals

$$s_k(t) = \frac{d}{2}\sum_{j=1}^{N} a_{kj}\,\varphi_j(t)$$

where $\{\varphi_1(t), \varphi_2(t), \ldots, \varphi_N(t)\}$ is a set of $N$ orthonormal signals, and $a_{kj}$ is either 1 or $-1$.
Note that all the signals are at a distance of $\sqrt{N}\,d/2$ from the origin and form the vertices of
the $N$-dimensional cube.

(a) Sketch the signal configuration in the signal space for $N = 1$, 2, and 3.

(b) For each configuration in part (a), sketch one possible set of waveforms.

(c) If all the $2^N$ symbols are equiprobable, find the optimum receiver and determine the error
probability $P_{eM}$ of the optimum receiver as a function of $E_b$ assuming an AWGN channel.

10.7-2 An orthogonal signal set is given by

$$s_k(t) = \sqrt{E}\,\varphi_k(t), \qquad k = 1, 2, \ldots, N$$

A biorthogonal signal set is formed from the orthogonal set by augmenting it with the negative
of each signal. Thus, we add to the orthogonal set another set

$$s_{-k}(t) = -\sqrt{E}\,\varphi_k(t)$$

This gives $2N$ signals in an $N$-dimensional space. Assuming all signals to be equiprobable
and an AWGN channel, obtain the error probability of the optimum receiver. How does the
bandwidth of the biorthogonal set compare with that of the orthogonal set?

10.8- 1 (a) What is the minimum energy equivalent signal set of a binary on-off signal set? 

(b) What is the minimum energy equivalent signal set of a binary FSK signal set? 

(c) Using geometrical signal space concepts, explain why the binary on-off and the binary 
orthogonal sets have identical error probabilities and why the binary polar energy 
requirements are 3 dB lower than those of the on-off or the orthogonal set. 

10.8-2 A source emits four equiprobable messages, which are encoded by signals $s_1(t)$, $s_2(t)$,
$s_3(t)$, and $s_4(t)$, respectively, where

S[ (0 = 20V5 sin 
$ 2 ( 0=0 
$ 3(0 = 10>/2cos 
$4(0 = ^10V5cos Y^t 

Each of these signal durations is 0 < t < and is zero outside this interval. The signals are 
transmitted over AWGN channels. 

(a) Represent these signals in a signal space. 

(b) Determine the decision regions. 

(c) Obtain an equivalent minimum energy signal set, 

(d) Determine the optimum receiver. 


= 


20

10.8-3 A quaternary signaling scheme uses four waveforms,

$$s_1(t) = 4\varphi_1(t)$$
$$s_2(t) = 2\varphi_1(t) + 2\varphi_2(t)$$
$$s_3(t) = -2\varphi_1(t) - 2\varphi_2(t)$$
$$s_4(t) = \varphi_2(t)$$

where $\varphi_1(t)$ and $\varphi_2(t)$ are orthonormal basis signals. All the signals are equiprobable, and the
channel noise is white Gaussian with PSD $S_n(\omega) = 10^{-4}$.

(a) Represent these signals in the signal space, and determine the optimum decision regions.

(b) Compute the error probability of the optimum receiver.

(c) Find the minimum energy equivalent signal set.

(d) Determine the amount of average energy reduction if the minimum energy equivalent
signal set is transmitted.

10.8-4 An $M = 4$ orthogonal signaling system uses $\sqrt{E}\,\varphi_1(t)$, $\sqrt{E}\,\varphi_2(t)$, $\sqrt{E}\,\varphi_3(t)$, and $\sqrt{E}\,\varphi_4(t)$
in its transmission.

(a) Find the minimum energy equivalent signal set. 

(b) Sketch the minimum energy equivalent signal set in three-dimensional space. 

(c) Determine the amount of average energy reduction by using the minimum energy 
equivalent signal set. 

10.8-5 A ternary signaling scheme ($M = 3$) uses the three waveforms:

$$s_1(t) = u(t) - u(t - T_0/3)$$
$$s_2(t) = u(t) - u(t - T_0)$$
$$s_3(t) = -[u(t - 2T_0/3) - u(t - T_0)]$$

The transmission rate is $1/T_0 = 200$ kilosymbols per second. All three messages are
equiprobable, and the channel noise is white Gaussian with PSD $S_n(\omega) = 2 \times 10^{-6}$.

(a) Determine the decision regions of the optimum receiver. 

(b) Determine the minimum energy signal set and sketch the waveforms. 

(c) Compute the mean energies of the signal set and its minimum energy equivalent set, found 
in part (b), 

10.8-6 Repeat Prob. 10.8-5 if $P(m_1) = 0.5$, $P(m_2) = 0.25$, and $P(m_3) = 0.25$.


10.8-7 A binary signaling scheme uses the two waveforms

$$s_1(t) = \text{rect}\left(\frac{t - 0.001}{0.002}\right) \quad\text{and}\quad s_2(t) = \Delta\left(\frac{t - 0.001}{0.002}\right)$$

(see Chapter 3 for the definitions of these signals). The signaling rate is 1000 pulses per
second. Both signals are equally likely, and the channel noise is white Gaussian with PSD
$S_n(\omega) = 2 \times 10^{-4}$.

(a) Determine the minimum energy equivalent signal set.

(b) Determine the error probability of the optimum receiver.

(c) Use a suitable orthogonal signal space to represent these signals as vectors.

Hint: Use Gram-Schmidt orthogonalization to determine the appropriate basis signals $\varphi_1(t)$
and $\varphi_2(t)$.

10.10-1 In a binary transmission with messages $m_0$ and $m_1$, the costs are defined as

$$C_{00} = C_{11} = 1 \quad\text{and}\quad C_{01} = C_{10} = 4$$

The two messages are equally likely. Determine the optimum Bayes receiver.

10.10-2 In a binary transmission with messages $m_0$ and $m_1$, the costs are defined as

$$C_{00} = C_{11} = 0 \quad\text{and}\quad C_{01} = C_{10} = C$$

The probability of $m_0$ is 1/3 and the probability of $m_1$ is 2/3.

(a) Determine the optimum Bayes receiver.

(b) Determine the minimum probability of error receiver.

(c) Determine the maximum likelihood receiver.

(d) Compare the probability of error between the two receivers in parts (b) and (c).

10.11-1 Plot and compare the probabilities of error for the noncoherent detection of binary ASK, binary
FSK, and binary DPSK.

10.11-2 Derive the probability of symbol error for different representations of QPSK signaling.



11 SPREAD SPECTRUM
COMMUNICATIONS


In traditional digital communication systems, the design of baseband pulse-shaping and
modulation techniques aims to minimize the amount of bandwidth consumed by the modulated
signal during transmission. This principal objective is clearly motivated by the desire
to achieve good spectral efficiency and thus to conserve bandwidth resources. Nevertheless,
a narrowband digital communication system exhibits two major weaknesses. First, its concentrated
spectrum makes it an easy target for detection and interception by unintended users
(e.g., battlefield enemies and unauthorized eavesdroppers). Second, its narrow band, having
very little redundancy, is more susceptible to jamming, since even partial band jamming can
ruin the signal reception.

Spread spectrum technologies were initially developed for the military and intelligence
communities to overcome the two aforementioned weaknesses: vulnerability to interception and
to jamming. The basic idea was to expand each user signal to occupy a much broader spectrum than
necessary. For fixed transmission power, a broader spectrum means both a lower signal power
level per unit bandwidth and higher spectral redundancy. The low signal power level makes the
communication signals difficult to detect and intercept, whereas high spectral redundancy makes
the signals more resistant to partial band jamming, whether intentional or unintentional.

There are two dominant spread spectrum technologies: frequency hopping spread spectrum
(FHSS) and direct sequence spread spectrum (DSSS). In this chapter, we provide detailed
descriptions of both systems.


11.1 FREQUENCY HOPPING SPREAD SPECTRUM 
(FHSS) SYSTEMS 

The concept of frequency hopping spread spectrum (FHSS) is in fact quite simple and easy 
to understand. Each user can still use its conventional modulation. The only difference is that 
now the carrier frequency can vary over regular intervals. When each user can vary its carrier 
frequency according to a predetermined, pseudorandom pattern, its evasive signal effectively 
occupies a broader spectrum and becomes harder to intercept and jam. 

The implementation of an FHSS system is shown in Fig. 11.1. If we first ignore the two
frequency converters, this system is no different from a simple digital communication system
with an FSK modulator and a demodulator. The only difference in this FHSS system lies in
the carrier frequency hopping controlled at the transmitter by the pseudonoise (PN) generator.
To track the hopping carrier frequency, the receiver must utilize the same PN generator in
synchronization with the transmitter PN generator.

Figure 11.1 Frequency hopping spread spectrum system.

We note that most FHSS signals adopt binary or M-ary FSK modulations instead of the
more efficient PAM, PSK, or QAM. The motivation for choosing FSK stems from its ability
to utilize the less complex noncoherent detection. In contrast, coherent detection is generally
needed for PAM, PSK, and QAM modulations. Because of the PN hopping pattern, coherent
detection would require the receiver to maintain phase coherence with the transmitter at every
one of the frequencies used in the hopping pattern. Such a requirement would be difficult to satisfy
during frequency hopping. On the other hand, FSK detection can be noncoherent without the
need for carrier phase coherence and can be easily incorporated into FHSS systems.

The frequency upconverter, as discussed in Example 4.2 of Chapter 4, can be a mixer or
a multiplier followed by a bandpass filter. Denote $T_s$ as the symbol period. Then the $M$-ary
FSK modulation signal can be written as

$$s_{\text{FSK}}(t) = A\cos(\omega_m t + \phi_m), \qquad mT_s < t < (m+1)T_s \tag{11.1a}$$

in which the $M$-ary FSK angular frequencies are specified by

$$\omega_m = \omega_c \pm \tfrac{1}{2}\Delta\omega,\ \ \omega_c \pm \tfrac{3}{2}\Delta\omega,\ \ \ldots,\ \ \omega_c \pm \tfrac{M-1}{2}\Delta\omega \tag{11.1b}$$

The frequency synthesizer output is constant for a period of $T_c$, often known as a “chip.” If we
denote the frequency synthesizer output as $\omega_h$ in a given chip, then the FHSS signal is

$$s_{\text{FH}}(t) = A\cos[(\omega_h + \omega_m)t + \phi_m] \tag{11.2}$$

for the particular chip period $T_c$. The frequency hopping pattern is controlled by the PN
generator and typically looks like Fig. 11.2. At the receiver, an identical PN generator enables
the receiver to detect the FHSS signal within the correct frequency band (i.e., the band the
signal has hopped to). If the original FSK signal only has bandwidth $B_s$ Hz, then the FHSS
signal will occupy a bandwidth $L$ times larger

$$B_c = L \cdot B_s$$

This factor $L$ is known as the spreading factor.
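As a rough illustration of Eqs. (11.1) and (11.2), the following minimal sketch lists the M-ary FSK tone offsets around the carrier and then adds the synthesizer (hop) frequency. The function names `fsk_tone_offsets` and `fhss_tone` and all numeric values are illustrative assumptions, not taken from the text; the sketch works in hertz rather than rad/s and assumes an even M.

```python
def fsk_tone_offsets(m_ary, spacing):
    """Offsets from the carrier: +/-1/2, +/-3/2, ..., +/-(M-1)/2 times the tone spacing."""
    if m_ary < 2 or m_ary % 2:
        raise ValueError("this sketch assumes an even M >= 2")
    return [(2 * i - (m_ary - 1)) * spacing / 2 for i in range(m_ary)]


def fhss_tone(hop_freq, carrier_freq, offset):
    """Instantaneous FHSS tone: synthesizer output plus the FSK tone (carrier + offset)."""
    return hop_freq + carrier_freq + offset


# Hypothetical numbers: 4-FSK with 2 Hz tone spacing, 100 kHz carrier, 1 MHz hop frequency.
offsets = fsk_tone_offsets(4, 2.0)
tones = [fhss_tone(1.0e6, 1.0e5, off) for off in offsets]
print(offsets)
print(tones)
```

The hop frequency changes once per chip, while the offset changes once per symbol, which is why the two are kept as separate arguments.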

Figure 11.2

For symbol period $T_s$ and chip period $T_c$, the corresponding symbol rate is $R_s = 1/T_s$ and
the hopping rate is $R_c = 1/T_c$. There are two types of frequency hopping in FHSS. If $T_c \ge T_s$,
then the FH is known as slow hopping. If $T_c < T_s$, it is known as fast FHSS, and there are
multiple hops within each data symbol. In other words, under fast hopping, each data symbol
is spread across multiple frequency bands and must be detected
by detection over these frequency bands.
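The roles of the shared PN generator and of the slow/fast distinction can be sketched in a few lines. This is only an illustration under stated assumptions: Python's seeded `random.Random` stands in for the PN generator (a real system would typically use a shift-register sequence), and the names `hop_pattern` and `hopping_type` are hypothetical.

```python
import random


def hop_pattern(num_bands, num_chips, seed):
    """Pseudorandom band index for each chip; the seed plays the role of the PN code."""
    rng = random.Random(seed)  # stand-in for the PN generator (assumption)
    return [rng.randrange(num_bands) for _ in range(num_chips)]


def hopping_type(symbol_period, chip_period):
    """Slow hopping if T_c >= T_s, fast hopping if T_c < T_s."""
    return "slow" if chip_period >= symbol_period else "fast"


# Transmitter and receiver run the same generator from the same state,
# so the receiver knows which band to listen to in every chip.
tx_pattern = hop_pattern(num_bands=20, num_chips=8, seed=1234)
rx_pattern = hop_pattern(num_bands=20, num_chips=8, seed=1234)
print(tx_pattern == rx_pattern)
print(hopping_type(symbol_period=1e-3, chip_period=4e-3))
print(hopping_type(symbol_period=1e-3, chip_period=0.25e-3))
```

A receiver started from a different seed would produce a different pattern and, in most chips, would be listening in the wrong band.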

One major advantage of FHSS lies in its ability to combat jamming. Suppose a jamming
source has a finite level of jamming power $P_J$. Against a narrowband signal with bandwidth
$B_s$, the jamming source can transmit within $B_s$ at all times, creating an interference
PSD level of $P_J/B_s$. Hence, the signal-to-interference ratio (SIR) for the narrowband (NB)
transmission is

$$\text{SIR}_{\text{NB}} = \frac{E_b}{P_J/B_s} \tag{11.3a}$$

On the other hand, against an FHSS signal with total bandwidth $B_c$, the jamming source must
divide its limited power and will generate a much lower level of interference PSD with average
value $P_J/B_c$. As a result, at any given time, the signal bandwidth is still $B_s$ and the SIR is

$$\text{SIR}_{\text{FH}} = \frac{E_b}{P_J/B_c} = L \cdot \frac{E_b}{P_J/B_s} \tag{11.3b}$$

Therefore, with a spreading factor of $L$, an FH signal is $L$ times more resistant to a jamming
signal with finite power than a narrowband transmission. Figure 11.3a and b illustrate the
different effects of the finite jamming power on narrowband and FHSS signals.

On the other hand, the jammer may decide to concentrate all its power $P_J$ in a narrow signal
bandwidth against FHSS. This will achieve partial band jamming. If the frequency hopping
is slow, such that $T_c = T_s$, then on average one out of every $L$ user symbols will encounter
the strong interference, as shown in Fig. 11.3c. Consider BFSK. We can assume a very strong
interference such that the bits transmitted in the jammed frequency band have the worst BER
of 0.5. Then, after averaging over the $L$ bands, the total BER of this partially jammed FHSS
system will be

$$P_b = \frac{L-1}{L}\cdot\frac{1}{2}\exp\left(-\frac{E_b}{2\mathcal{N}}\right) + \frac{1}{L}\cdot\frac{1}{2} \ \ge\ \frac{1}{2L} \tag{11.4}$$

Thus, the partially jammed FHSS signal detection has rather high BER under slow hopping. By
employing sufficiently strong forward error correction (FEC) codes, to be discussed in Chapter
14, such data errors can be corrected by the receiver.
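The averaging argument behind Eq. (11.4) is easy to check numerically. The sketch below assumes the noncoherent BFSK bit error rate $\frac{1}{2}e^{-E_b/2\mathcal{N}}$ for the unjammed bands and a BER of 0.5 in the single jammed band; the function names and the 20 dB operating point are illustrative assumptions.

```python
import math


def bfsk_ber(eb_over_n):
    """Noncoherent BFSK bit error rate, 0.5 * exp(-Eb / (2 N))."""
    return 0.5 * math.exp(-eb_over_n / 2.0)


def slow_hop_partial_jam_ber(eb_over_n, num_bands):
    """Average BER when one of num_bands bands is jammed (BER 0.5 there)."""
    clean = bfsk_ber(eb_over_n)
    return (num_bands - 1) / num_bands * clean + 0.5 / num_bands


eb_over_n = 10 ** (20 / 10)  # hypothetical 20 dB operating point
for num_bands in (10, 20, 100):
    print(num_bands, slow_hop_partial_jam_ber(eb_over_n, num_bands))
```

At high $E_b/\mathcal{N}$ the first term vanishes and the BER sits at the floor $1/(2L)$, which is why error correction is needed under slow hopping.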


Example 11.1 Consider the case of a fast hopping system in which $T_c \ll T_s$. There are $L$ frequency bands for
this FHSS system. Assume that a jamming source jams one of the $L$ bands. Let the number of
hops per $T_s$ be less than $L$ and no frequency is repeated in each $T_s$. Derive the BER performance
of a fast hopping BFSK system under this partial jamming.

With fast hopping, each user symbol hops over

$$L_h = T_s/T_c, \qquad L_h < L$$

narrow bands. Hence on average, a user symbol will encounter partial jamming with a
probability of $L_h/L$. When a BFSK symbol does not encounter partial jamming during
hopping, its BER remains unchanged. If a BFSK symbol does encounter partial band jamming,
we can approximate its BER performance by discarding the energy in the jammed
band. In other words, we can approximate the BFSK symbol performance under jamming
by letting its useful signal energy be

$$\frac{L_h - 1}{L_h}\,E_b$$

Thus, on average, the BFSK performance under fast hopping consists of a statistical average
of the two types of BFSK bits:

$$P_b = \frac{1}{2}\left(1 - \frac{L_h}{L}\right)\exp\left(-\frac{E_b}{2\mathcal{N}}\right) + \frac{1}{2}\,\frac{L_h}{L}\exp\left(-\frac{E_b}{2\mathcal{N}}\cdot\frac{L_h - 1}{L_h}\right)$$

In particular, when $L_h \gg 1$, fast hopping FHSS clearly achieves much better BER as

$$P_b \approx \frac{1}{2}\left(1 - \frac{L_h}{L}\right)\exp\left(-\frac{E_b}{2\mathcal{N}}\right) + \frac{1}{2}\,\frac{L_h}{L}\exp\left(-\frac{E_b}{2\mathcal{N}}\right) = \frac{1}{2}\exp\left(-\frac{E_b}{2\mathcal{N}}\right)$$

In other words, by using fast hopping, the BER performance of FHSS under partial
jamming approaches the BER without jamming.
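The example's conclusion can be checked numerically with a short sketch. It assumes the same model as the example: a symbol is jammed with probability (hops per symbol)/L, and a jammed symbol keeps all but one hop's share of its energy. The names `fast_hop_partial_jam_ber` and `hops_per_symbol`, and the chosen operating point, are illustrative assumptions.

```python
import math


def bfsk_ber(eb_over_n):
    """Noncoherent BFSK bit error rate, 0.5 * exp(-Eb / (2 N))."""
    return 0.5 * math.exp(-eb_over_n / 2.0)


def fast_hop_partial_jam_ber(eb_over_n, num_bands, hops_per_symbol):
    """Average BER when one band is jammed and each symbol visits hops_per_symbol bands."""
    p_jammed = hops_per_symbol / num_bands
    # A jammed symbol discards the energy of the one hop that landed in the jammed band.
    reduced = (hops_per_symbol - 1) / hops_per_symbol * eb_over_n
    return (1.0 - p_jammed) * bfsk_ber(eb_over_n) + p_jammed * bfsk_ber(reduced)


eb_over_n = 20.0  # hypothetical operating point, about 13 dB
print(bfsk_ber(eb_over_n))                          # no jamming
print(fast_hop_partial_jam_ber(eb_over_n, 20, 1))   # one hop per symbol
print(fast_hop_partial_jam_ber(eb_over_n, 20, 10))  # ten hops per symbol
```

With one hop per symbol the expression collapses to the slow hopping floor of roughly 1/(2L); with ten hops per symbol the BER stays within a small factor of the unjammed value.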
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11.2 MULTIPLE FHSS USER SYSTEMS 
AND PERFORMANCE 


Clearly, FHSS systems provide better security against potential enemy jammers or interceptors.
Without full knowledge of the hopping pattern that has been established, adversaries cannot
follow, eavesdrop on, or jam an FHSS user transmission. On the other hand, if an FHSS system
has only one transmitter, then its use of the much larger bandwidth $B_c$ would be too wasteful.
To improve the frequency efficiency of FHSS systems, multiple users may be admitted over
the same frequency band $B_c$ with little performance loss.

As shown in Fig. 11.4, each of the M users is assigned a unique PN hopping code that 
controls its frequency hopping pattern in FHSS. The codes can be chosen so that the users 
never or rarely collide in the spectrum with one another. With multiple users accessing the 
same L bands, spectral efficiency can be made equal to the original FSK signal without any 
loss of FHSS security advantages. Thus, multiple user access becomes possible by assigning 
these distinct hopping (spreading) codes to different users, leading to code division multiple 
access (CDMA). 

Generally, any overlapping of two or more user PN sequences would lead to signal collision
in frequency bands where the PN sequence values happen to be identical during certain chips.
Theoretically, well-designed hopping codes can prevent such user collisions. However, in
practice, the lack of a common synchronization clock observable by all users means that each
user exercises frequency hopping independently. Also, sometimes there are more than $L$ active
users gaining access to the FHSS system. Both cases lead to user collision. For slow and fast
FHSS systems alike, such collisions lead to significant increases in user detection errors.


Performance of FHSS with Multiple User Access 

For any particular FHSS CDMA user, the collision problem would typically be limited to its
partial band. In fact, the effect of such collisions is similar to the situation of partial band
jamming, as analyzed next.

Recall that the performance analysis of FSK systems has been discussed in Chapter 10
(Sec. 10.7) under AWGN channels. It has been shown that the probability of symbol detection
error for noncoherent $M$-ary FSK signals is

$$P_{eM} = 1 - P_{cM} = \sum_{m=1}^{M-1}\binom{M-1}{m}\frac{(-1)^{m+1}}{m+1}\,e^{-mE_b\log_2 M/[\mathcal{N}(m+1)]} \tag{11.5}$$

For slow FHSS systems, each data symbol is transmitted using a fixed frequency carrier.
Therefore, the detection error probability of a slow FHSS system is identical to Eq. (11.5).
In particular, the BER of the binary FSK system is shown to be [see Eq. (10.145) in Sec. 10.11]

$$P_b = \frac{1}{2}e^{-E_b/2\mathcal{N}}$$

Figure 11.4
However, if two users transmit simultaneously in the same frequency band, a collision or a
“hit” occurs. In this case we will assume that the probability of error is 0.5.* Thus the overall
probability of bit error can be modeled as

$$P_b = \frac{1}{2}e^{-E_b/2\mathcal{N}}(1 - P_h) + \frac{1}{2}P_h \tag{11.6}$$

where $P_h$ is the probability of a hit, which we must determine. Consider random hopping. If
there are $L$ frequency slots, there is a $1/L$ probability that a given interferer will be present in
the desired user's slot. If there are $M - 1$ interferers or other users, the probability that at least
one is present in the desired frequency slot is

$$P_h = 1 - \left(1 - \frac{1}{L}\right)^{M-1} \approx \frac{M-1}{L} \tag{11.7}$$

assuming $L$ is large. Substituting this into Eq. (11.6) gives

$$P_b \approx \frac{1}{2}e^{-E_b/2\mathcal{N}}\left(1 - \frac{M-1}{L}\right) + \frac{1}{2}\,\frac{M-1}{L} \tag{11.8}$$

If $M = 1$, the probability of error reduces to the BER of BFSK. If $M > 1$, by letting $E_b/\mathcal{N}$
approach infinity, we see that under random hopping,

$$\lim_{E_b/\mathcal{N}\to\infty} P_b = \frac{1}{2}\,\frac{M-1}{L} = \frac{1}{2}P_h \tag{11.9}$$

which illustrates the irreducible floor of the detected bit error rate due to multiple access
interference (MAI). It is therefore important to design hopping patterns to reduce $P_h$ with
multiple users.
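The hit model of Eqs. (11.6) to (11.9) can be explored with a few lines of code. This is a minimal sketch assuming random, slotted hopping and an error probability of 0.5 under a hit, as in the text; the function names and the example values of the user count and slot count are illustrative assumptions.

```python
import math


def bfsk_ber(eb_over_n):
    """Noncoherent BFSK bit error rate, 0.5 * exp(-Eb / (2 N))."""
    return 0.5 * math.exp(-eb_over_n / 2.0)


def hit_probability(num_users, num_slots):
    """Probability that at least one of the other users occupies the desired slot."""
    return 1.0 - (1.0 - 1.0 / num_slots) ** (num_users - 1)


def multiuser_ber(eb_over_n, num_users, num_slots):
    """Overall BER: clean BFSK when there is no hit, 0.5 when there is a hit."""
    p_hit = hit_probability(num_users, num_slots)
    return bfsk_ber(eb_over_n) * (1.0 - p_hit) + 0.5 * p_hit


num_users, num_slots = 5, 100  # hypothetical system size
print(hit_probability(num_users, num_slots))   # exact hit probability
print((num_users - 1) / num_slots)             # large-L approximation
print(multiuser_ber(1000.0, num_users, num_slots))  # close to the error floor
```

Raising $E_b/\mathcal{N}$ further does not lower the last value; only a larger number of slots or fewer users does.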


Asynchronous FHSS 

The previous analysis assumes that all users hop their carrier frequencies in synchronization.
This is known as slotted frequency hopping. This kind of time slotting is easy to maintain
if distances between all transmitter-receiver pairs are essentially the same. This may not be
a realistic scenario for many FHSS systems. Even when synchronization can be achieved
between individual user clocks, signals traveling over different transmission paths will not
arrive synchronously because of their various propagation delays. A simple development for
asynchronous performance can be shown following the approach of Geraniotis and Pursley,¹
which shows that the probability of a hit in the asynchronous case is

$$P_h = 1 - \left\{1 - \frac{1}{L}\left(1 + \frac{1}{N_b}\right)\right\}^{M-1} \tag{11.10}$$

where $N_b$ is the number of bits per hop. Comparing Eqs. (11.7) and (11.10) we see that, for the
asynchronous case, the probability of a hit is increased, as expected. By using Eq. (11.10) in
Eq. (11.6), we obtain the probability of error for the asynchronous case as

$$P_b = \frac{1}{2}e^{-E_b/2\mathcal{N}}\left\{1 - \frac{1}{L}\left(1 + \frac{1}{N_b}\right)\right\}^{M-1} + \frac{1}{2}\left[1 - \left\{1 - \frac{1}{L}\left(1 + \frac{1}{N_b}\right)\right\}^{M-1}\right] \tag{11.11}$$

As in the case of partial band jamming, the BER of the FHSS users decreases as the spreading
factor increases. Additionally, by incorporating a sufficiently strong FEC code at the
transmitter, the FHSS CDMA users can accommodate most of the collisions.

* The assumption of a 0.5 error probability under a hit is actually pessimistic, since studies
have shown that this value can be lower.
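To see how much the loss of slot synchronization costs, the slotted and asynchronous hit probabilities can be compared directly. The sketch below assumes the slotted form $1 - (1 - 1/L)^{M-1}$ and the asynchronous form $1 - \{1 - (1 + 1/N_b)/L\}^{M-1}$ discussed above; function names and the sample parameters are illustrative assumptions.

```python
def slotted_hit_probability(num_users, num_slots):
    """Hit probability when all users hop in synchronization."""
    return 1.0 - (1.0 - 1.0 / num_slots) ** (num_users - 1)


def async_hit_probability(num_users, num_slots, bits_per_hop):
    """Hit probability when hop boundaries of different users are not aligned."""
    per_user = (1.0 + 1.0 / bits_per_hop) / num_slots
    return 1.0 - (1.0 - per_user) ** (num_users - 1)


num_users, num_slots = 5, 100  # hypothetical system size
print(slotted_hit_probability(num_users, num_slots))
for bits_per_hop in (1, 4, 64):
    print(bits_per_hop, async_hit_probability(num_users, num_slots, bits_per_hop))
```

With one bit per hop an interferer can overlap two of the desired user's hops, which roughly doubles the hit probability; with many bits per hop the penalty becomes small and the asynchronous value approaches the slotted one.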


Example 11.2 Consider an AWGN channel with noise level $\mathcal{N} = 10^{-11}$. A user signal is a binary FSK
modulation of data rate 16 kbit/s that occupies a bandwidth of 20 kHz. The received signal
power is −20 dBm. An enemy has a jamming source that can jam either a narrowband or a
broadband signal. The jamming power is finite such that the total received jamming signal
power is at most −24 dBm. Use a spreading factor $L = 20$ to determine the approximate
improvement of signal-to-noise ratio for the FHSS system under jamming.

Since $P_s = -20$ dBm $= 10^{-5}$ W and $T_b = 1/16{,}000$ s, the energy per bit equals

$$E_b = P_s \cdot T_b = \frac{1}{1.6 \times 10^{9}}$$

On the other hand, the noise level is $\mathcal{N} = 10^{-11}$. Let the jamming signal have Gaussian
distribution. The jamming power level equals $P_J = -24$ dBm $\approx 4 \times 10^{-6}$ W.

When jamming occurs over the narrow band of 20 kHz, the power level of the
interference is

$$J_n = \frac{P_J}{20{,}000\ \text{Hz}} = 2 \times 10^{-10}$$

Thus, the resulting signal-to-noise ratio is

$$\frac{E_b}{J_n + \mathcal{N}} = \frac{(1.6 \times 10^{9})^{-1}}{2 \times 10^{-10} + 10^{-11}} \approx 4.74\ \text{dB}$$

If the jamming must cover the entire spread spectrum $L$ times wider, then the power level
of the interference becomes 20 times weaker:

$$J_n = \frac{P_J}{400{,}000\ \text{Hz}} = 1 \times 10^{-11}$$

Thus, the resulting signal-to-noise ratio in this case is

$$\frac{E_b}{J_n + \mathcal{N}} = \frac{(1.6 \times 10^{9})^{-1}}{10^{-11} + 10^{-11}} \approx 14.95\ \text{dB}$$

The improvement of SNR is approximately 10 dB.
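The arithmetic of this example is easy to reproduce. The short script below uses the jamming power of $4 \times 10^{-6}$ W that the worked solution uses and recovers the two SNR values; the variable names and the helper functions `dbm_to_watts` and `to_db` are illustrative assumptions.

```python
import math


def dbm_to_watts(dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (dbm / 10) / 1000


def to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10 * math.log10(ratio)


noise_psd = 1e-11          # noise level N
bit_rate = 16_000          # bit/s
signal_bandwidth = 20_000  # Hz
spreading_factor = 20
signal_power = dbm_to_watts(-20)  # about 1e-5 W
jam_power = 4e-6                  # W, the value used in the worked solution

energy_per_bit = signal_power / bit_rate
snr_narrowband = energy_per_bit / (jam_power / signal_bandwidth + noise_psd)
snr_fhss = energy_per_bit / (jam_power / (spreading_factor * signal_bandwidth) + noise_psd)

print(round(to_db(snr_narrowband), 2))
print(round(to_db(snr_fhss), 2))
print(round(to_db(snr_fhss / snr_narrowband), 2))
```

The improvement comes out slightly above 10 dB rather than the full $10\log_{10} 20 \approx 13$ dB, because the thermal noise term does not shrink when the jammer is forced to spread its power.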
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11.3 APPLICATIONS OF FHSS 

FHSS has been adopted in several practical applications. The most notable ones among them
are the wireless local area network (WLAN) standard for Wi-Fi, known as IEEE 802.11,²
and the wireless personal area network (WPAN) standard of Bluetooth.


From IEEE 802.11 to Bluetooth 

IEEE 802.11 was the first Wi-Fi standard, initially released in 1997. With data rate limited to
2 Mbit/s, 802.11 only had very limited deployment before 1999, when the release and much
broader adoption of IEEE 802.11a and 802.11b removed the FHSS option. Now virtually
obsolete, the FHSS option of IEEE 802.11 was miraculously revived in the highly successful
commercial product sold as Bluetooth.³ Bluetooth differs from Wi-Fi in that Wi-Fi systems are
required to provide higher throughput and cover greater distances.⁴ Wi-Fi can also be more
costly and consume more power.

Bluetooth, on the other hand, is an ultra-short-range communication system used in electronic
products such as cellphones, computers, automobiles, modems, headsets, and appliances.
Replacing line-of-sight infrared, Bluetooth can be used when two or more devices are in
proximity to each other. It does not require high bandwidth. Because Bluetooth is basically the
same as the IEEE 802.11 frequency hopping (FH) option, we only need to describe its details.

The protocol operates in the license-free industrial, scientific, and medical (ISM) band of
2.4 to 2.4835 GHz. To avoid interfering with other devices and networks in the ISM band, the
Bluetooth protocol divides the band into 79 channels of 1 MHz bandwidth and executes (slow)
frequency hopping at a rate of up to 1600 Hz. Two Bluetooth devices synchronize frequency
hopping by communicating in a master-slave mode relationship. A network group of up to
eight devices forms a piconet, which has one master. A slave node of one piconet can be the
master of another piconet. Relationships between master and slave nodes in piconets are shown
in Fig. 11.5. A master Bluetooth device can communicate with up to seven active devices. At
any time, the master device can bring into active status up to 255 further inactive, or parked,
devices. One special feature of Bluetooth is its ability to implement adaptive frequency hopping
(AFH). This adaptivity is built in to allow Bluetooth devices to avoid crowded frequencies in
the hopping sequence.
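The idea of a shared hopping sequence over 79 channels, with adaptive avoidance of crowded channels, can be sketched as follows. This is only a toy model under stated assumptions: it is not the hop selection procedure of the Bluetooth specification, a seeded `random.Random` stands in for the hopping code shared by master and slave, and the names `hop_sequence` and `blocked` are hypothetical.

```python
import random

NUM_CHANNELS = 79   # 1 MHz channels in the ISM band, as described in the text
HOP_RATE_HZ = 1600  # maximum hopping rate given in the text


def hop_sequence(num_hops, seed, blocked=frozenset()):
    """Pseudorandom channel index per hop, skipping channels marked as crowded."""
    allowed = [ch for ch in range(NUM_CHANNELS) if ch not in blocked]
    if not allowed:
        raise ValueError("every channel is blocked")
    rng = random.Random(seed)  # stand-in for the shared hopping code (assumption)
    return [rng.choice(allowed) for _ in range(num_hops)]


# Master and slave derive the same sequence from the same seed.
crowded = frozenset(range(10, 32))  # hypothetical band occupied by another network
master = hop_sequence(16, seed=0xB1E, blocked=crowded)
slave = hop_sequence(16, seed=0xB1E, blocked=crowded)

print(master == slave)
print(1 / HOP_RATE_HZ)  # dwell time per hop in seconds
```

Both ends must agree on the blocked set as well as on the seed; otherwise their sequences diverge at the first hop that would have landed on a blocked channel.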

The modulation of the (basic rate) Bluetooth signal is shown in Fig. 11.6. The binary signal
is transmitted by means of Gaussian pulse shaping on the FSK modulation signal. As shown
in Fig. 11.6, a simple binary FSK replaces the Gaussian low-pass filter with a direct path. The
inclusion of the Gaussian low-pass filter generates what is known as the Gaussian FSK (or
GFSK) signal. GFSK is a continuous phase FSK. It achieves better bandwidth efficiency by
enforcing phase continuity. Better spectral efficiency is also achieved through partial response
signaling (PRS) in GFSK. The Gaussian filter response stretches each bit over multiple symbol
periods.

Figure 11.5

Figure 11.6

TABLE 11.1
Major Specifications of 802.11 FHSS and Bluetooth.

                                     802.11 FHSS               Bluetooth (basic rate)
Frequency band                       ISM (2.4-2.4835 GHz)      ISM (2.4-2.4835 GHz)
Duplex format                        TDD                       TDD
Single-channel bandwidth             1 MHz                     1 MHz
Number of nonoverlapping channels    79                        79
BT_s product                         0.5                       0.5
Minimum hopping distance             6                         6
Modulation                           GFSK-2 and GFSK-4         GFSK-2
Data rate                            1 Mbit/s and 2 Mbit/s     723.1 kbit/s
Hopping rate                         2.5-160 Hz                1600 Hz

More specifically, the Gaussian LPF impulse response is ideally given by

$$h(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-t^2/2\sigma^2}, \qquad \sigma = \frac{\sqrt{\ln 2}}{2\pi B}$$

where $B$ is the 3 dB bandwidth of the Gaussian low-pass filter. Because this response is
noncausal, the practical implementation truncates the filter response to $4T_s$ seconds. This way,
each bit of information is extended over a window 3 times broader than the bit duration $T_s$.

Note that the selection of $B$ is determined by the symbol rate $1/T_s$. In 802.11 and Bluetooth,
$B = 0.5/T_s$ is selected. The FM modulation index must be between 0.28 and 0.35. The GFSK
symbol rate is always 1 MHz; binary FSK and four-level FSK can be implemented as GFSK-2
and GFSK-4, achieving data throughput of 1 and 2 Mbit/s, respectively. Table 11.1 summarizes
the key parameters and differences in IEEE 802.11 and Bluetooth (basic rate).
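A short numerical check helps connect the filter parameters to the quoted numbers. The sketch assumes the standard Gaussian low-pass form with $\sigma = \sqrt{\ln 2}/(2\pi B)$, whose frequency response is $H(f) = e^{-2\pi^2\sigma^2 f^2}$, together with $B = 0.5/T_s$ and a 1 MHz symbol rate; the function names are illustrative assumptions.

```python
import math


def gaussian_sigma(bandwidth_3db):
    """Standard deviation (seconds) of the Gaussian pulse for a given 3 dB bandwidth."""
    return math.sqrt(math.log(2)) / (2 * math.pi * bandwidth_3db)


def gaussian_impulse_response(t, bandwidth_3db):
    """Ideal (untruncated) Gaussian low-pass impulse response h(t)."""
    sigma = gaussian_sigma(bandwidth_3db)
    return math.exp(-t * t / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)


def gaussian_frequency_response(f, bandwidth_3db):
    """Fourier transform of h(t): H(f) = exp(-2 (pi sigma f)^2)."""
    sigma = gaussian_sigma(bandwidth_3db)
    return math.exp(-2 * (math.pi * sigma * f) ** 2)


symbol_period = 1e-6             # 1 MHz symbol rate
bandwidth = 0.5 / symbol_period  # B * T_s = 0.5

# Area under h(t) inside a window of total length 4 T_s centered on the pulse.
step = symbol_period / 1000
area = sum(gaussian_impulse_response(n * step, bandwidth) for n in range(-2000, 2001)) * step

print(gaussian_frequency_response(bandwidth, bandwidth) ** 2)  # power gain at f = B
print(area)
```

The power gain at $f = B$ is one half, confirming that $B$ is the 3 dB bandwidth, and essentially all of the pulse area falls inside the $4T_s$ window, so the truncation discards very little.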

We note that our discussions on Bluetooth have focused on the (basic rate) versions 1.1
and 1.2. More recently, version 2 of Bluetooth has been released.⁴ Version 2.0 implementations
feature Bluetooth Enhanced Data Rate (EDR) and reach 2.1 Mbit/s. Technically, version 2.0
devices retain the FHSS feature but resort to the more efficient (differential) PSK modulations.


SINCGARS 

SINCGARS stands for single channel ground and airborne radio system. It represents a family
of VHF-FM combat radios used by the U.S. military. First produced by ITT in 1983,
SINCGARS transmits voice with FM and data with binary CPFSK at 16 kbit/s, occupying a
bandwidth of 25 kHz. There can be as many as 2320 channels within the operational band of
30 to 87.975 MHz.

To combat jamming, SINCGARS radios can implement frequency hopping at the rather
slow rate of 100 Hz. Because the hopping rate is quite slow, SINCGARS is no longer effective
against modern jamming devices. For this reason, SINCGARS is being replaced by the newer
and more versatile JTRS (joint tactical radio system).


From Hollywood to CDMA 

Like many good ideas, the concept of frequency hopping has been claimed by multiple inventors.
One such patent that gained little attention was awarded to Willem Broertjes of Amsterdam,
Netherlands, in August 1932 (U.S. Patent no. 1,869,659).⁵ However, the most intriguing patent
on frequency hopping came from one of Hollywood's well-known actresses during World
War II, Hedy Lamarr. In 1942 she and her coinventor George Antheil (an eccentric composer)
were awarded U.S. Patent no. 2,292,387 for their “Secret Communications System.” The patent
was designed to make radio-guided torpedoes harder to detect or to jam. Largely because of the
Hollywood connection, Hedy Lamarr became a legendary figure in the wireless communication
community, often credited as the inventor of CDMA, whereas other less glamorous figures
such as Willem Broertjes have been largely forgotten.

Hedy Lamarr was a major movie star of her time.⁶ Born Hedwig Eva Maria Kiesler in
Vienna, Austria, she first gained fame in the 1933 Austrian film Ecstasy for some shots that
were highly unconventional in those days. In 1937, escaping the Nazis and her first husband (a
Nazi arms dealer), she went to London, where she met Louis Burt Mayer, cofounder and boss
of the MGM studio. Mayer helped the Austrian actress's Hollywood career by giving her a
movie contract and a new name: Hedy Lamarr. Lamarr starred with famous colleagues such
as Clark Gable, Spencer Tracy, and Judy Garland, appearing in more than a dozen films during
her film career.

Clearly gifted scientifically, Hedy Lamarr worked with George Antheil, a classical composer,
to help the war effort. They originated an idea of a sophisticated antijamming device for
use in radio-controlled torpedoes. In August 1942, under her married name at the time, Hedy
Kiesler Markey, Hedy Lamarr was awarded U.S. Patent no. 2,292,387 (Fig. 11.7), together with
George Antheil. They donated the patent as their contribution to the war effort. Drawing
inspiration from the composer's piano, their invention of frequency hopping uses 88 frequencies,
one for each note on a piano keyboard.

However, the invention would not be implemented during World War II. It was simply 
too difficult to pack vacuum tube electronics into a torpedo. The idea of frequency hopping, 
nevertheless, became reality 20 years later during the 1962 Cuban missile crisis, when the 
system was installed on ships sent to block communications to and from Cuba. Ironically, 
by then, the Lamarr-Antheil patent had expired. The idea of frequency hopping, or more 
broadly, the idea of spread spectrum, has since been extensively used in military and civilian 
communications, including cellular phones, wireless LAN, Bluetooth, and numerous other 
wireless communications systems. 

Only in recent years has Hedy Lamarr started receiving a new kind of recognition as a 
celebrity inventor. In 1997 Hedy Lamarr and George Antheil received the Electronic Frontier 
Foundation (EFF) Pioneer Award. Furthermore, in August 1997, Lamarr was honored with the
prized BULBIE Gnass Spirit of Achievement Bronze Award (the “Oscar” of inventing). If she
had won an Academy Award for her work in film, she would have been the only person
to receive two entirely different “Oscar” awards! Still, inventors around the world are truly
delighted to welcome a famous movie celebrity into their ranks. 

Inventor Hedy Kiesler Markey died in 2000 at the age of 86. 
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Figure 11.7 
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11.4 DIRECT SEQUENCE SPREAD SPECTRUM 

FHSS systems exhibit some important advantages, including low-complexity transceivers and 
resistance to jamming. However, the difficulty of carrier synchronization under frequency 
hopping means that only noncoherent demodulations for FSK and DPSK are actually practical. 
As shown in the analysis from Sec. 10.11, FSK and DPSK tend to have poorer BER performance
(power efficiency) and poorer bandwidth efficiency compared with QAM systems, which
require coherent detection. Furthermore, its susceptibility to collision makes FHSS a less
effective technology for CDMA. As modern communication systems have demonstrated, direct
sequence spread spectrum (DSSS) systems are much more efficient in bandwidth and power
utilization.⁷ Today, DSSS has become the dominant CDMA technology in advanced wireless
communication systems. It is not an exaggeration to state that DSSS and CDMA are almost 
synonymous. 


Optimum Detection of DSSS PSK 

Direct sequence spread spectrum is a technology that is more suitable for integration with 
bandwidth-efficient linear modulations such as QAM/PSK. Although there are several different 
ways to view DSSS, its key operation of spectrum spreading is achieved by a PN sequence, 
also known as the PN code or PN chip. The PN sequence is mostly binary, consisting of 1s
and 0s, which are represented by polar signaling of +1 and −1. To minimize interference
and to facilitate chip synchronization, the PN sequence has some nice autocorrelation and 
cross-correlation properties. 

Direct sequence spread spectrum (DSSS) expands the traditional narrowband signal by
utilizing a spreading signal $c(t)$. As shown in Fig. 11.8, the original data signal is linearly
modulated into a QAM signal $s_{\mathrm{QAM}}(t)$. Instead of transmitting this signal directly over its
required bandwidth, DSSS modifies the QAM signal by multiplying the spreading chip signal
$c(t)$ with the QAM narrowband signal. Although the signal carrier frequency remains
unchanged at $\omega_c$, the new signal after spreading becomes

$$s_{\mathrm{DS}}(t) = s_{\mathrm{QAM}}(t)\,c(t) \tag{11.12}$$

Hence, the transmitted signal $s_{\mathrm{DS}}(t)$ is a product of two signals whose spread bandwidth
is equal to the sum of the bandwidths of the QAM signal $s_{\mathrm{QAM}}(t)$ and the spreading
signal $c(t)$.

Figure 11.8 DSSS system.

Figure 11.9


PN Sequence Generation 

A good PN sequence $c(t)$ is characterized by an autocorrelation that is similar to that of
white noise. This means that the autocorrelation function of a PN sequence should be
high near $\tau = 0$ and low for all $\tau \neq 0$, as shown in Fig. 11.9a. Moreover, in CDMA
applications several users share the same band using different PN sequences. Hence, it is necessary
that the cross-correlation among different pairs of PN sequences be small to reduce mutual
interference.

A PN code is periodic. A digital shift register circuit with output feedback can generate
a sequence with long period and low susceptibility to structural identification by an
outsider. The most widely known binary PN sequences are the maximum length shift register
sequences (m-sequences). Such a sequence, which can be generated by an m-stage
shift register with suitable feedback connection, has a length $L = 2^m - 1$ bits, the maximum
period for such a finite state machine. Figure 11.9b shows a shift register encoder for
$m = 6$ and $L = 63$. For such “short” PN sequences, the autocorrelation function is nearly an
impulse and is periodic:

$$R_c(\tau) = \int_0^{LT_c} c(t)\,c(t+\tau)\,dt =
\begin{cases}
LT_c, & \tau = 0, \pm LT_c, \pm 2LT_c, \ldots \\
-T_c, & \tau = nT_c \text{ with } n \text{ not a multiple of } L
\end{cases} \tag{11.13}$$

where $T_c$ is the chip period.

As a matter of terminology, a DSSS spreading code is a short code if the PN sequence
period equals the data symbol period $T_s$. A DSSS spreading code is a long code if the PN
sequence period is a (typically large) multiple of the data symbol period.
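
The shift register construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feedback taps (stages 1 and 6, i.e., the recurrence a[n] = a[n-1] XOR a[n-6]) are one choice that yields a maximum length sequence for m = 6 and are not necessarily the connections drawn in Fig. 11.9b. The function names are hypothetical.

```python
def m_sequence(m=6, taps=(1, 6)):
    """Return one period (2**m - 1 bits) of an LFSR sequence.

    taps are 1-indexed register stages XORed to form the feedback bit.
    The default (1, 6) is an assumed choice that gives period 63 for m = 6.
    """
    state = [1] * m                       # any nonzero initial state works
    period = 2 ** m - 1
    bits = []
    for _ in range(period):
        bits.append(state[-1])            # output is taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]   # shift the register by one stage
    return bits


def periodic_autocorrelation(bits):
    """Periodic autocorrelation of the polar (+1/-1) version of the bits."""
    chips = [1 - 2 * b for b in bits]     # map 0 -> +1 and 1 -> -1
    n = len(chips)
    return [sum(chips[i] * chips[(i + k) % n] for i in range(n))
            for k in range(n)]


pn = m_sequence()
acf = periodic_autocorrelation(pn)
print("peak:", acf[0], "off-peak values:", set(acf[1:]))
```

In units of the chip period, the peak value L and the constant off-peak value correspond to the two cases of the impulse-like autocorrelation discussed above.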


Single-User DSSS Analysis 

The simplest analysis of a DSSS system can be based on Fig. 11.8. To achieve spread spectrum,
the chip signal $c(t)$ typically varies much faster than the QAM symbols. As shown in Fig. 11.8,
there are multiple chips of ±1 within each symbol duration of $T_s$. Denote the spreading
factor

$$L = T_s / T_c, \qquad T_c = \text{chip period}$$

Then the spread signal spectrum is essentially $L$ times broader than the original modulation
spectrum:

$$B_c = (L + 1)B_s \approx L \cdot B_s$$


Note that the spreading signal $c(t) = \pm 1$ at any given instant. Given the polar nature of the
binary chip signal, the receiver, under an AWGN channel, can easily “despread” the received
signal

$$y(t) = s_{\mathrm{DS}}(t) + n(t) = s_{\mathrm{QAM}}(t)\,c(t) + n(t) \tag{11.14}$$

by multiplying the chip signal with the received signal

$$r(t) = c(t)\,y(t) = s_{\mathrm{QAM}}(t)\,c^2(t) + n(t)\,c(t) \tag{11.15}$$

Because $c^2(t) = 1$, this simplifies to

$$r(t) = s_{\mathrm{QAM}}(t) + \underbrace{n(t)\,c(t)}_{x(t)} \tag{11.16}$$

Thus, this multiplication allows the receiver to successfully “despread” the spread spectrum
signal. The analysis of the DSSS receiver depends on the characteristics of the noise $x(t)$.
Because $c(t)$ is deterministic, and $n(t)$ is Gaussian with zero mean, $x(t)$ remains Gaussian
with zero mean. As a result, the receiver performance analysis requires finding only the PSD
of $x(t)$.

To determine the power spectral density of the “despread” noise $x(t) = n(t)c(t)$, we can
start from the definition of PSD (Sec. 9.3):


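$$\begin{aligned}
S_x(f) &= \lim_{T\to\infty} \frac{\overline{|X_T(f)|^2}}{T} \\
&= \lim_{T\to\infty} \frac{1}{T}\, \overline{\int_{-T/2}^{T/2}\!\int_{-T/2}^{T/2} x(t_1)\,x(t_2)\,e^{-j2\pi f(t_2 - t_1)}\,dt_1\,dt_2} \\
&= \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2}\!\int_{-T/2}^{T/2} \overline{x(t_1)\,x(t_2)}\,e^{-j2\pi f(t_2 - t_1)}\,dt_1\,dt_2
\end{aligned} \tag{11.17a}$$

$$\begin{aligned}
S_x(f) &= \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2}\!\int_{-T/2}^{T/2} c(t_1)\,c(t_2)\,\overline{n(t_1)\,n(t_2)}\,e^{-j2\pi f(t_2 - t_1)}\,dt_1\,dt_2 \\
&= \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2}\!\int_{-T/2}^{T/2} c(t_1)\,c(t_2)\,R_n(t_2 - t_1)\,e^{-j2\pi f(t_2 - t_1)}\,dt_1\,dt_2
\end{aligned} \tag{11.17b}$$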


Recall that

$$R_n(t_2 - t_1) = \int_{-\infty}^{\infty} S_n(\nu)\,e^{j2\pi\nu(t_2 - t_1)}\,d\nu$$


We therefore have 

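$$S_x(f) = \lim_{T\to\infty} \frac{1}{T} \int_{-\infty}^{\infty}\!\int_{-T/2}^{T/2}\!\int_{-T/2}^{T/2} c(t_1)\,c(t_2)\,S_n(\nu)\,e^{-j2\pi(f-\nu)(t_2 - t_1)}\,dt_1\,dt_2\,d\nu \tag{11.18a}$$

$$\begin{aligned}
S_x(f) &= \int_{-\infty}^{\infty} S_n(\nu) \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2}\!\int_{-T/2}^{T/2} c(t_1)\,c(t_2)\,e^{-j2\pi(f-\nu)(t_2 - t_1)}\,dt_1\,dt_2\,d\nu \\
&= \int_{-\infty}^{\infty} S_n(\nu) \lim_{T\to\infty} \frac{|C_T(f-\nu)|^2}{T}\,d\nu \\
&= \int_{-\infty}^{\infty} S_n(\nu)\,S_c(f-\nu)\,d\nu
\end{aligned} \tag{11.18b}$$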


The last equality comes from the definition of PSD for $c(t)$. Equation (11.18) illustrates the
dependency of the detector noise PSD on the chip signal $c(t)$. As long as the PN sequence is
almost orthogonal such that it satisfies Eq. (11.13), then

$$R_c(\tau) \approx LT_c \cdot \sum_i \delta(\tau - i \cdot LT_c) \tag{11.19a}$$

$$S_c(f) \approx LT_c \cdot \frac{1}{LT_c} \sum_k \delta(f - k/LT_c) = \sum_k \delta(f - k/LT_c) \tag{11.19b}$$

and

$$S_x(f) = \sum_k S_n(f - k/LT_c) \tag{11.20}$$

In other words, as long as the chip sequence is approximately orthogonal, the noise at the QAM
detector remains white Gaussian with zero mean. For practical reasons, the white noise $n(t)$
is filtered at the receiver to be band-limited to $1/2T_c$. As a result, the noise spectrum after the
despreader still is

$$S_x(f) = \frac{\mathcal{N}}{2} \tag{11.21}$$

In other words, the spectral level also remains unchanged. Thus, the performance analysis
carried out for coherent QAM and PSK detections in Chapter 10 can be applied directly.

In Sec. 10.6 we showed that for a channel with (white) noise of PSD $\mathcal{N}/2$, the error
probability of the optimum receiver for polar signaling is given by

$$P_b = Q\left(\sqrt{\frac{2E_b}{\mathcal{N}}}\right) \tag{11.22}$$

where $E_b$ is the energy per bit (energy of one pulse). This result demonstrates that the error
probability of an optimum receiver is unchanged regardless of whether or not we use DSSS.
While this result appears to be somewhat surprising, in fact, it is quite consistent with the AWGN
analysis. For a single user, the only change in DSSS lies in the spreading of transmissions over a
broader spectrum by effectively using a new pulse shape $c(t)$. Hence, the modulation remains
QAM whereas the channel remains AWGN. Consequently, the coherent detection analysis of
Sec. 10.6 is fully applicable to DSSS signals.
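
The spreading and despreading steps can be illustrated with a chip-rate, discrete-time sketch. The model below is an assumption made for illustration only (one sample per chip, real-valued polar symbols, a random ±1 code of length 16, and a noise level chosen arbitrarily); the function names are hypothetical. It shows that multiplying by the same chip sequence and averaging over one symbol recovers the data, because the square of each chip equals 1.

```python
import random


def spread(symbols, code):
    """Repeat each symbol over one code period and multiply by the chips."""
    return [s * c for s in symbols for c in code]


def despread(samples, code):
    """Multiply by the chips and average over each symbol interval."""
    length = len(code)
    outputs = []
    for start in range(0, len(samples), length):
        block = samples[start:start + length]
        outputs.append(sum(y * c for y, c in zip(block, code)) / length)
    return outputs


rng = random.Random(0)                       # fixed seed for repeatability
L = 16                                       # assumed spreading factor
code = [rng.choice((-1, 1)) for _ in range(L)]
symbols = [rng.choice((-1, 1)) for _ in range(8)]

tx = spread(symbols, code)
noise_free = despread(tx, code)              # equals the symbols exactly

sigma = 0.5                                  # assumed per-chip noise std
rx = [x + rng.gauss(0.0, sigma) for x in tx]
decisions = [1 if r >= 0 else -1 for r in despread(rx, code)]
print(symbols)
print(decisions)
```

Averaging L chips reduces the noise standard deviation by the square root of L, which is the same gain a matched filter would obtain for an unspread pulse of equal energy.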


11.5 RESILIENT FEATURES OF DSSS

As in FHSS, DSSS systems provide better security against potential jamming or interception 
by spreading the overall signal energy over a bandwidth L times broader. First, its low power 
level is difficult for interceptors to detect. Furthermore, without the precise knowledge of the 
user spreading code [or $c(t)$], adversaries cannot despread and recover the baseband QAM
signal effectively. In addition, partial band jamming signals interfere with only a portion of 
the signal energy. They do not block out the entire signal spectrum and are hence not effective 
against DSSS signals. 

To analyze the effect of partial band jamming, consider an interference $i(t)$ that impinges
on the receiver to yield

$$y(t) = s_{\mathrm{QAM}}(t)\,c(t) + i(t)$$

Let the interference bandwidth be $B_i$. After despreading, the output signal plus interference
becomes

$$y(t)\,c(t) = s_{\mathrm{QAM}}(t) + i(t)\,c(t) \tag{11.23}$$

It is important to observe that the interference term has a new frequency response because of
despreading by $c(t)$

$$i_a(t) = i(t)\,c(t) \iff I(f) * C(f) \tag{11.24}$$

which has approximate bandwidth $B_c + B_i = LB_s + B_i$.
DSSS Analysis against Narrowband Jammers 

If the interference has the same bandwidth as the QAM signal $B_s$, then the “despread” interference
$i_a(t)$ will now have bandwidth equal to $(L + 1)B_s$. In other words, the narrowband
interference $i(t)$ will in fact be spread over a band roughly $L$ times wider by the “despreading” signal $c(t)$.

If the narrowband interference has total power $P_i$ and bandwidth $B_s$, then the original
interference spectral level before despreading is

$$S_i(f) = \frac{P_i}{B_s}, \qquad f \in [f_c - 0.5B_s,\ f_c + 0.5B_s]$$

After despreading, the spectrum of the interference $i_a(t)$ becomes

$$S_{i_a}(f) = \frac{P_i}{(L+1)B_s}, \qquad f \in [f_c - 0.5(L+1)B_s,\ f_c + 0.5(L+1)B_s]$$

Because of the despreading operation, the narrowband interference has only $1/(L + 1)$ of its
original spectral strength. Note that the desired QAM signal still has its original bandwidth
$(\omega_c - \pi B_s,\ \omega_c + \pi B_s)$. Hence, against narrowband interferences, despreading can improve the
signal-to-interference ratio (SIR) by a factor of

$$\frac{E_b \big/ \dfrac{P_i}{(L+1)B_s}}{E_b \big/ \dfrac{P_i}{B_s}} = L + 1 \tag{11.25}$$

This result illustrates that DSSS is very effective against narrowband (partial band) jamming
signals. It effectively improves the SIR by the spreading factor. The “spreading” effect of the
despreader on a narrowband interference signal is illustrated in Fig. 11.10.
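
As a quick numerical illustration of this SIR improvement, the factor L + 1 can be expressed in decibels for a few spreading factors. The values of L below are arbitrary examples, and the function name is hypothetical.

```python
import math


def narrowband_sir_gain_db(spreading_factor):
    """SIR improvement factor (L + 1) against a narrowband jammer, in dB."""
    return 10.0 * math.log10(spreading_factor + 1)


for L in (63, 128, 1023):
    print(L, round(narrowband_sir_gain_db(L), 1), "dB")
```

Each doubling of the spreading factor adds roughly 3 dB of protection against a narrowband jammer, at the cost of a proportionally wider transmission bandwidth.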

The ability of DSSS to combat narrowband jamming also means that a narrowband communication
signal can coexist with DSSS signals. The SIR analysis and Fig. 11.10 already
established the resistance of DSSS signals to narrowband interferers. Conversely, if a narrowband
signal must be demodulated in the presence of a DSSS signal, then the narrowband
signal can also be extracted with little interference from the DSSS signal by replacing the
despreader with a narrow bandpass filter. In this case, the roles of signal and interference are
in fact reversed.

Figure 11.10 Narrowband interference mitigation by the DSSS despreader.

Figure 11.11 Equivalent baseband diagram of DSSS system.

DSSS Analysis against Broadband Jammers 

In many cases, interferences come from broadband sources that are not generated from the 
DSSS spreading approach. Against such interferences, the despreading operation only mildly 
broadens and weakens the interference spectrum. 

Let the interference be broadband with the same bandwidth $LB_s$ as the spread signal.
Based on Eq. (11.24), the interference after despreading would be $i_a(t)$, which has bandwidth
of $2LB_s$. In other words, broadband interference $i(t)$ will in fact be expanded to a spectrum
nearly twice as wide and half as strong in intensity. From this discussion, we can see that a
DSSS signal is most effective against narrowband interferences and not as effective against 
broadband interferences. 


11.6 CODE DIVISION MULTIPLE-ACCESS (CDMA) OF DSSS

The RF diagram of a DSSS system can be equivalently represented by the baseband diagram of
Fig. 11.11, which provides a new perspective on the DSSS system that is amenable to analysis.
Let the (complex-valued) QAM data symbol be

$$s_k = a_k + jb_k, \qquad (k-1)T_s < t < kT_s \tag{11.26}$$

Then it is clear from the PN chip sequence that the baseband signal after spreading is

$$s_k \cdot c(t) = (a_k + jb_k) \cdot c(t), \qquad (k-1)T_s < t < kT_s \tag{11.27}$$

In other words, the symbol $s_k$ is using

$$c(t), \qquad (k-1)T_s < t < kT_s$$

as its pulse shape for transmission. Consequently, at the receiver, the optimum receiver would
require $c(t)$ to be used as a correlator receiver (or, equivalently, a matched filter). As evident
from the diagram of Fig. 11.11, the despreader serves precisely the function of the optimum
matched filter (or correlator receiver). Such a receiver is known as a conventional single-user
optimum receiver.
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Figure 11.12 

A code division multiple-access (CDMA) system based on DSSS.



We have shown that DSSS systems enjoy advantages against the threat of narrowband 
jamming and attempts at interception. However, if a DSSS system has only one signal to 
transmit, then its use of the larger bandwidth B c would be too wasteful. Just as in FHSS, CDMA 
of DSSS can be achieved by letting multiple users, each given a distinct PN spreading signal 
$c_i(t)$, access the broad bandwidth of $LB_s$ simultaneously. Such a multiple-access system with
$M$ users based on CDMA is shown in Fig. 11.12. Each user can apply a single-user optimum
receiver. 

Because these CDMA users will be transmitting without time division or frequency division,
multiple-access interference (MAI) exists at each of the receivers. To analyze a DSSS
system with $M$ multiple-access users, we compute the interference at the output of a given
receiver caused by the remaining $M - 1$ users. It is simpler to focus on the time interval
$[(k-1)T_s,\ kT_s]$ and the $k$th symbol of all $M$ users. In Fig. 11.12, we have made multiple
assumptions for analytical simplicity. Here we state them explicitly:

• The $i$th user transmits one symbol $s_k^{(i)}$ over the interval $[(k-1)T_s,\ kT_s]$.

• There is no relative delay among $M$ users, and each receiver receives the $k$th symbol of all
$M$ users within $[(k-1)T_s,\ kT_s]$.

• All user symbols have unit power; that is, $E\{|s_k^{(i)}|^2\} = 1$.

• The $i$th user's transmission power is $P_i$.

• The $i$th user channel has a scalar gain of $g_i$.

• The channel is AWGN with noise $n(t)$.

The first two assumptions indicate that all M users are synchronous. While asynchronous 
CDMA systems are commonplace in practice, their analysis is a straightforward but nontrivial 
generalization of the synchronous case.* 

Because all users share the same bandwidth, every receiver will have equal access to the
same channel output signal

$$y(t) = \sum_{j=1}^{M} g_j \sqrt{P_j}\, s_k^{(j)}\, c_j(t) + n(t) \tag{11.28a}$$


* In asynchronous CDMA analysis, the analysis window must be enlarged to translate it into a nearly equivalent 
synchronous CDMA case with many more equivalent users ,** 1 y 
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After application of the matched filter (despreading), the itb receiver output at the sampling 
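instant $t = kT_s$ is

$$\begin{aligned}
r_k^{(i)} &= \int_{(k-1)T_s}^{kT_s} c_i(t)\,y(t)\,dt \\
&= \sum_{j=1}^{M} g_j \sqrt{P_j}\, s_k^{(j)} \int_{(k-1)T_s}^{kT_s} c_i(t)\,c_j(t)\,dt + \int_{(k-1)T_s}^{kT_s} c_i(t)\,n(t)\,dt \\
&= \sum_{j=1}^{M} g_j \sqrt{P_j}\, R_{i,j}(k)\, s_k^{(j)} + n_i(k)
\end{aligned} \tag{11.28b}$$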

For notational convenience, we have defined the (time-varying) cross-correlation coefficient
between two spreading codes as

$$R_{i,j}(k) = \int_{(k-1)T_s}^{kT_s} c_i(t)\,c_j(t)\,dt \tag{11.28c}$$

and the $i$th receiver noise sample as

$$n_i(k) = \int_{(k-1)T_s}^{kT_s} c_i(t)\,n(t)\,dt \tag{11.28d}$$

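It is important to note that the noise samples of Eq. (11.28d) are Gaussian with mean

$$\overline{n_i(k)} = \int_{(k-1)T_s}^{kT_s} c_i(t)\,\overline{n(t)}\,dt = 0$$

The cross-correlation between two noise samples can be found as

$$\begin{aligned}
\overline{n_i(k)\,n_j(\ell)} &= \int_{(k-1)T_s}^{kT_s}\!\int_{(\ell-1)T_s}^{\ell T_s} c_i(t_1)\,c_j(t_2)\,\overline{n(t_1)\,n(t_2)}\,dt_2\,dt_1 \\
&= \int_{(k-1)T_s}^{kT_s}\!\int_{(\ell-1)T_s}^{\ell T_s} c_i(t_1)\,c_j(t_2)\,R_n(t_2 - t_1)\,dt_2\,dt_1 \\
&= \int_{(k-1)T_s}^{kT_s}\!\int_{(\ell-1)T_s}^{\ell T_s} c_i(t_1)\,c_j(t_2)\,\frac{\mathcal{N}}{2}\,\delta(t_2 - t_1)\,dt_2\,dt_1
\end{aligned} \tag{11.29a}$$

$$\begin{aligned}
\overline{n_i(k)\,n_j(\ell)} &= \frac{\mathcal{N}}{2}\,\delta[k-\ell] \int_{(k-1)T_s}^{kT_s} c_i(t_1)\,c_j(t_1)\,dt_1 \\
&= \frac{\mathcal{N}}{2}\,R_{i,j}(k)\,\delta[k-\ell]
\end{aligned} \tag{11.29b}$$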

Equation (11.29) shows that the noise samples at the DSSS CDMA receiver are temporally
white. This means that the Gaussian noise samples at different sampling time instants are
independent of one another. Therefore, the optimum detection of $\{s_k^{(i)}\}$ can be based on the
samples $\{r_k^{(i)}\}$ at time $t = kT_s$.

For short-code CDMA, $\{c_i(t)\}$ are periodic and the period equals $T_s$. In other words, the
PN spreading signals $\{c_i(t)\}$ are identical over each period $[(k-1)T_s,\ kT_s]$. Therefore, in
short-code CDMA systems, the cross-correlation coefficient between two spreading codes is a
constant

$$R_{i,j}(k) = R_{i,j} \tag{11.30}$$


Note that the decision variable of the $i$th receiver is

$$r_k^{(i)} = g_i \sqrt{P_i}\, R_{i,i}(k)\, s_k^{(i)} + \underbrace{\sum_{j \neq i} g_j \sqrt{P_j}\, R_{i,j}(k)\, s_k^{(j)}}_{I_k^{(i)}} + n_i(k) \tag{11.31}$$

The term $I_k^{(i)}$ is an additional term resulting from the multiple-access interference of the $M - 1$
interfering signals. When the spreading codes are selected to satisfy the orthogonality condition

$$R_{i,j}(k) = 0, \qquad i \neq j$$

then the CDMA multiple-access interference is zero, and each CDMA user obtains performance
identical to that of the single DSSS user or a single baseband QAM user.

There are various ways to generate orthogonal spreading codes. Walsh-Hadamard codes 
are the best-known orthogonal spreading codes. Given a code length of L identical to the 
spreading factor, there are a total of L orthogonal Walsh-Hadamard codes. A simple example 
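of the Walsh-Hadamard code for $L = 8$ is given here. Each row in the matrix of Eq. (11.32) is
a spreading code of length 8:

$$\mathbf{W}_8 = \begin{bmatrix}
+1 & +1 & +1 & +1 & +1 & +1 & +1 & +1 \\
+1 & -1 & +1 & -1 & +1 & -1 & +1 & -1 \\
+1 & +1 & -1 & -1 & +1 & +1 & -1 & -1 \\
+1 & -1 & -1 & +1 & +1 & -1 & -1 & +1 \\
+1 & +1 & +1 & +1 & -1 & -1 & -1 & -1 \\
+1 & -1 & +1 & -1 & -1 & +1 & -1 & +1 \\
+1 & +1 & -1 & -1 & -1 & -1 & +1 & +1 \\
+1 & -1 & -1 & +1 & -1 & +1 & +1 & -1
\end{bmatrix} \tag{11.32}$$

At the next level, the Walsh-Hadamard code has length 16, which can be obtained from $\mathbf{W}_8$ via

$$\mathbf{W}_{2^k} = \begin{bmatrix}
\mathbf{W}_{2^{k-1}} & \mathbf{W}_{2^{k-1}} \\
\mathbf{W}_{2^{k-1}} & -\mathbf{W}_{2^{k-1}}
\end{bmatrix}$$

In fact, starting from $\mathbf{W}_1 = [1]$ with $k = 0$, this recursion can be used to generate length
$L = 2^k$ Walsh-Hadamard codes.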
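
The recursion lends itself to a short program. The following Python sketch (the function name is hypothetical) builds the length $2^k$ codes from the single entry [1] and checks that distinct rows are orthogonal, so that the cross-correlation between different codes is zero while each code correlates with itself to give L.

```python
def walsh_hadamard(k):
    """Return the 2**k x 2**k Walsh-Hadamard matrix as a list of rows."""
    w = [[1]]
    for _ in range(k):
        top = [row + row for row in w]                     # [W  W]
        bottom = [row + [-x for x in row] for row in w]    # [W -W]
        w = top + bottom
    return w


W8 = walsh_hadamard(3)

# Gram matrix of the rows: entry (i, j) is the inner product of codes i and j.
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in W8] for r1 in W8]

for row in W8:
    print(" ".join("%+d" % x for x in row))
```

The Gram matrix equals L times the identity, which is the discrete-time counterpart of the orthogonality condition that removes the MAI for synchronous users.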


Gaussian Approximation of Nonorthogonal MAI 

In practical applications, many user spreading codes are not fully orthogonal. As a result, the
effect of MAI on user detection performance may be serious. To analyze the effect of MAI on a
single-user receiver, we need to study the MAI probability distribution. The exact probability
analysis of $I_k^{(i)}$ is difficult. An alternative is to use a good approximation. When $M$ is large, one
may invoke the central limit theorem to approximate the MAI as a Gaussian random variable.
Recall that the QAM symbols are independent with zero mean and unit variance, that is,

$$\overline{s_k^{(j)}} = 0 \qquad \overline{s_k^{(i)}\, s_k^{(j)*}} = 0,\ \ i \neq j \qquad \overline{|s_k^{(j)}|^2} = 1$$

Hence, we can approximate the MAI as Gaussian with mean

$$\overline{I_k^{(i)}} = \sum_{j \neq i} g_j \sqrt{P_j}\, R_{i,j}(k)\, \overline{s_k^{(j)}} = 0 \tag{11.33}$$

and variance

$$\overline{|I_k^{(i)}|^2} = \sum_{j \neq i} |g_j|^2 P_j\, |R_{i,j}(k)|^2 \tag{11.34}$$


The effect of this MAI approximation is a strengthened channel noise. Effectively, the
performance of detection based on decision variable $r_k^{(i)}$ is degraded by the additional Gaussian
MAI. Based on single-user analysis, the new equivalent SNR is degraded and becomes

$$\frac{2E_b}{E_b \left(|g_i|^2 P_i\, |R_{i,i}(k)|^2\right)^{-1} \sum_{j \neq i} |g_j|^2 P_j\, |R_{i,j}(k)|^2 + \mathcal{N}}$$

For the special case of BPSK or polar signaling, the BER of the $i$th CDMA user is approximately

$$P_b \approx Q\left(\sqrt{\frac{2E_b}{E_b \left(|g_i|^2 P_i\, |R_{i,i}(k)|^2\right)^{-1} \sum_{j \neq i} |g_j|^2 P_j\, |R_{i,j}(k)|^2 + \mathcal{N}}}\right) \tag{11.35}$$

Observe that when a single user is present ($M = 1$), Eq. (11.35) becomes the well-known
polar BER result of

$$P_b = Q\left(\sqrt{\frac{2E_b}{\mathcal{N}}}\right)$$

as expected. The same result is also true when all spreading codes are mutually orthogonal
such that $R_{i,j}(k) = 0$, $i \neq j$.

In the extreme case of noise-free systems, when the signal-to-noise ratio is very high
($E_b/\mathcal{N} \to \infty$), we obtain

$$\lim_{E_b/\mathcal{N} \to \infty} P_b = Q\left(\sqrt{\frac{2\,|g_i|^2 P_i\, |R_{i,i}(k)|^2}{\sum_{j \neq i} |g_j|^2 P_j\, |R_{i,j}(k)|^2}}\right)$$

This shows the presence of an irreducible error floor for the MAI limited case. This error floor
vanishes when the spreading codes are mutually orthogonal such that $R_{i,j}(k) = 0$ if $i \neq j$.
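
The Gaussian approximation of Eq. (11.35) is easy to evaluate numerically. The sketch below is an illustration only: the function names, the three-user example, and its gains, powers, and correlation values are assumptions rather than values from the text. It divides the numerator and denominator of the equivalent SNR by the noise level so that only the ratio of bit energy to noise level is needed.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_gaussian_mai(eb_over_n, i, gains, powers, corr_row):
    """Approximate BPSK BER of user i under the Gaussian MAI model.

    corr_row[j] holds the cross-correlation R_{i,j}(k) for the fixed user i.
    """
    signal = abs(gains[i]) ** 2 * powers[i] * abs(corr_row[i]) ** 2
    mai = sum(abs(gains[j]) ** 2 * powers[j] * abs(corr_row[j]) ** 2
              for j in range(len(gains)) if j != i)
    # Equivalent SNR of Eq. (11.35), normalized by the noise level.
    snr = 2.0 * eb_over_n / (eb_over_n * mai / signal + 1.0)
    return q_function(math.sqrt(snr))


# Assumed example: three equal-power users, user 0 is the desired one.
gains = [1.0, 1.0, 1.0]
powers = [1.0, 1.0, 1.0]
corr_row = [64.0, 8.0, 8.0]

for eb_over_n_db in (0, 5, 10, 20, 40):
    eb_over_n = 10.0 ** (eb_over_n_db / 10.0)
    print(eb_over_n_db, "dB ->", ber_gaussian_mai(eb_over_n, 0, gains, powers, corr_row))
```

As the ratio of bit energy to noise level grows, the printed BER stops decreasing and settles at the error floor determined only by the ratio of desired signal power to MAI power.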
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The Near-Far Problem 

The Gaussian approximation of the MAI has limitations when used to predict system performance.
While the central limit theorem implies that $I_k^{(i)}$ will tend toward a Gaussian distribution
near the center of its distribution, convergence may require a very large number of CDMA
users $M$. In a typical CDMA system, the user number $M$ is only in the order of 64 to 128.
When $M$ is not sufficiently large, the Gaussian approximation of the MAI may be highly
inaccurate, particularly in a near-far environment.

The so-called near-far environment describes the following scenario. 

• The desired transmitter is much farther away from its receivers than some interfering
transmitters.

• The spreading codes are not mutually orthogonal; that is, $R_{i,j}(k) \neq 0$ when $i \neq j$.

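If we assume identical user transmission power in all cases (i.e., $P_j = P_0$), in the near-far
environment the desired signal channel gain $g_i$ is much smaller than the channel gains of some
interferers. In other words, there may exist some user set $J$ such that

$$g_i \ll g_j, \qquad j \in J \tag{11.36}$$

As a result, Eq. (11.31) becomes

$$\begin{aligned}
r_k^{(i)} &= \sqrt{P_0}\, g_i R_{i,i}(k)\, s_k^{(i)} + \sqrt{P_0} \sum_{j \in J} g_j R_{i,j}(k)\, s_k^{(j)} + \sqrt{P_0} \sum_{j \notin J,\ j \neq i} g_j R_{i,j}(k)\, s_k^{(j)} + n_i(k) \\
&= \sqrt{P_0}\, g_i R_{i,i}(k)\, s_k^{(i)} + \sqrt{P_0} \sum_{j \in J} g_j R_{i,j}(k)\, s_k^{(j)} + n_i'(k)
\end{aligned} \tag{11.37}$$

where we have defined an equivalent noise term

$$n_i'(k) = \sqrt{P_0} \sum_{j \notin J,\ j \neq i} g_j R_{i,j}(k)\, s_k^{(j)} + n_i(k) \tag{11.38}$$

that is approximately Gaussian.

In a near-far environment, it becomes likely that the smaller signal channel gain and the
nonzero cross-correlation result in the domination of the (far) signal component

$$g_i R_{i,i}(k)\, s_k^{(i)}$$

by the strong (near) interference

$$\sum_{j \in J} g_j R_{i,j}(k)\, s_k^{(j)}$$

The Gaussian approximation analysis of the BER in Eq. (11.35) no longer applies.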


Example 11.3 Consider a CDMA system with two users ($M = 2$). Both signal transmission powers are 10 mW.
The receiver for user 1 can receive signals from both users. To this receiver, the two
signal channel gains are

$$g_1 = 10^{-4} \qquad g_2 = 10^{-1}$$

The spreading gain equals $L = 128$ such that

$$R_{1,1}(k) = 128 \qquad R_{1,2}(k) = -1$$

The sampled noise $n_1(k)$ is Gaussian with zero mean and variance of $10^{-6}$. Determine the
BER for the desired user 1 signal.

The receiver decision variable at time $k$ is

$$\begin{aligned}
r_k &= \sqrt{10^{-2}} \cdot 10^{-4} \cdot 128 \cdot s_k^{(1)} + \sqrt{10^{-2}} \cdot 10^{-1} \cdot (-1) \cdot s_k^{(2)} + n_1(k) \\
&= 10^{-2} \left[ 0.128\, s_k^{(1)} - s_k^{(2)} + 100\, n_1(k) \right]
\end{aligned}$$

For equally likely data symbols $\pm 1$, the BER of user 1 is

$$\begin{aligned}
P_b &= 0.5\, P\left[ r_k > 0 \mid s_k^{(1)} = -1 \right] + 0.5\, P\left[ r_k < 0 \mid s_k^{(1)} = 1 \right] \\
&= P\left[ r_k < 0 \mid s_k^{(1)} = 1 \right] \\
&= P\left[ 0.128 - s_k^{(2)} + 100\, n_1(k) < 0 \right]
\end{aligned}$$

Because of the equally likely data symbol $P[s_k^{(2)} = \pm 1] = 0.5$, we can utilize the total
probability theorem to obtain

$$\begin{aligned}
P_b &= 0.5\, P\left[ 0.128 - s_k^{(2)} + 100\, n_1(k) < 0 \mid s_k^{(2)} = 1 \right] \\
&\quad + 0.5\, P\left[ 0.128 - s_k^{(2)} + 100\, n_1(k) < 0 \mid s_k^{(2)} = -1 \right] \\
&= 0.5\, P\left[ 0.128 - 1 + 100\, n_1(k) < 0 \right] + 0.5\, P\left[ 0.128 + 1 + 100\, n_1(k) < 0 \right] \\
&= 0.5\, P\left[ 100\, n_1(k) < 0.872 \right] + 0.5\, P\left[ 100\, n_1(k) < -1.128 \right] \\
&= 0.5\, [1 - Q(8.72)] + 0.5\, Q(11.28) \\
&\approx 0.5
\end{aligned}$$

Thus, the BER of the desired signal is essentially 0.5, which means that the desired user
is totally dominated by the interference in this particular near-far environment.
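
The arithmetic of Example 11.3 can be reproduced with a few lines of Python. This is a minimal numerical check using the values stated in the example; the variable names are chosen here for illustration.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


power = 10e-3                      # 10 mW for both users
g1, g2 = 1e-4, 1e-1                # channel gains seen by receiver 1
r11, r12 = 128.0, -1.0             # correlation coefficients
noise_std = math.sqrt(1e-6)        # standard deviation of the noise sample

amp_desired = math.sqrt(power) * g1 * r11      # multiplies s1
amp_interf = math.sqrt(power) * g2 * r12       # multiplies s2

# Given s1 = +1, an error occurs when amp_desired + amp_interf*s2 + n < 0,
# which has probability Q((amp_desired + amp_interf*s2) / noise_std).
pb = 0.5 * sum(q_function((amp_desired + amp_interf * s2) / noise_std)
               for s2 in (+1, -1))
print(pb)
```

Half of the time the strong interferer pushes the decision variable to the wrong side by almost nine noise standard deviations, so the result is indistinguishable from 0.5.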


Power Control in CDMA 

Because the near-far problem is a direct result of the difference in user signal powers at the
receiver, one effective approach to overcome the near-far effect is to increase the power of
the “far” users while decreasing the power of the “near” users. This power balancing approach
is known in CDMA as power control.

Power control assumes that all receivers are collocated. For example, cellular communications
take place by connecting a number of mobile phones within each cell to a base station
that serves the cell. All mobile phone transmissions within the cell are received and detected
at the base station. The transmission from a mobile unit to the base station is known as the
uplink or reverse link, as opposed to downlink or forward link when the base station transmits
to a mobile user. It is clear that the near-far effect does not occur during downlink. In fact,
because multiple user transmissions can be perfectly synchronized, downlink CDMA can be 
easily made synchronous to maintain orthogonality. Also at each mobile receiver, all signal 
transmissions have equal channel gain because all originate from the same base station. Neither 
near-far condition can be satisfied. For this reason, CDMA mobile users in downlink do not 
require power control or other means to combat strong MAI. 

When CDMA is used on the uplink to enable multiple mobile users to transmit their signals 
to the base station, the near-far problem will often occur. By adopting power control, the base 
station can send instructions to the mobile phones to increase or to decrease their transmission 
powers. The goal is for all user signals to arrive at the base station receivers with similar 
power levels despite their different channel gains. In other words, a constant value of $|g_i|^2 P_i$
is achieved because power control via receiver feedback provides instructions to the mobile 
transmitters. 

One of the major second-generation cellular standards, cdmaOne (also known as IS-95),
pioneered by Qualcomm, is a DSSS CDMA system. It applies power control to overcome the 
near-far problem at base station receivers. 

Power control takes two forms: open loop and closed loop. Under open-loop power control,
a mobile station adjusts its power based on the strength of the signal it receives from the base
station. This presumes that a reciprocal relationship exists between forward and reverse links,
an assumption that may not hold if the links operate in different frequency bands. As a result,
closed-loop power control is often required because the base station can order the mobile
station to change its transmitted power.
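
The closed-loop idea can be illustrated with a toy simulation. The sketch below is hypothetical and is not the procedure of any particular standard: the fixed 1 dB step, the target received power, the number of iterations, and the function name are all assumptions made for illustration. In each iteration the base station compares the received power of a user with the target and commands that user to step its transmit power up or down.

```python
import math


def closed_loop_power_control(gains, tx_power_dbw, target_dbw,
                              step_db=1.0, iterations=100):
    """Return transmit powers (dBW) after repeated up/down commands."""
    tx = list(tx_power_dbw)
    for _ in range(iterations):
        for i, g in enumerate(gains):
            # Received power |g|^2 * P expressed in dBW.
            rx = tx[i] + 20.0 * math.log10(abs(g))
            # One-bit feedback: step up if too weak, otherwise step down.
            tx[i] += step_db if rx < target_dbw else -step_db
    return tx


gains = [1e-4, 1e-1]                       # a "far" user and a "near" user
tx = closed_loop_power_control(gains, [-20.0, -20.0], target_dbw=-70.0)
rx = [p + 20.0 * math.log10(abs(g)) for p, g in zip(tx, gains)]
print("transmit powers (dBW):", tx)
print("received powers (dBW):", rx)
```

After convergence both users arrive within one step of the target even though their channel gains differ by 60 dB, which is the balancing effect that power control aims for.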


Near-Far Resistance 

An important concept of near-far resistance was defined by S. Verdu. K1 The main objective 
is to determine whether a CDMA receiver can overcome the MAI by simply increasing the
signal-to-noise ratio $E_b/\mathcal{N}$. A receiver is defined as near-far resistant if, for every user in the
CDMA system, there exists a nonzero $\gamma$ such that no matter how strong the interferences are,
the probability of bit error $P_b^{(i)}$ as a function of $E_b/\mathcal{N}$ satisfies

$$\lim_{\mathcal{N} \to 0} \frac{P_b^{(i)}(E_b/\mathcal{N})}{Q\left(\sqrt{\gamma \cdot 2E_b/\mathcal{N}}\right)} < +\infty$$

This means that a near-far resistant receiver should have no BER floor as $\mathcal{N} \to 0$. Our
analysis of the conventional matched filter receiver, even based on Gaussian approximation, has
demonstrated the lack of near-far resistance by the conventional single-user receiver. Although
power control alleviates the near-far effect, it does not make the conventional receiver near-far
resistant. To achieve near-far resistance, we will need to apply multiuser detection receivers to
jointly detect all user symbols instead of approximating the sum of interferences as additional
Gaussian noise.


11.7 MULTIUSER DETECTION (MUD) 

Multiuser detection (MUD) is an alternative to power control as a tool against the near-far effect.
Unlike power control, MUD can equalize the received signal power without feedback from
the receivers to the transmitters. Instead, MUD is a centralized receiver that aims to jointly
detect all user signals despite the difference of the received signal strength.

For MUD, the general assumption is that the receiver has access to all $M$ signal samples
of Eq. (11.31). In addition, the receiver has knowledge of the following information:

1. User signal strengths $g_i\sqrt{P_i}$.

2. Spreading sequence cross-correlation $R_{i,j}(k)$.

3. Statistics of the noise samples $n_i(k)$.

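To explain the different MUD receivers, it is more convenient to write Eq. (11.31) in vector
form:

$$\begin{bmatrix} r_k^{(1)} \\ r_k^{(2)} \\ \vdots \\ r_k^{(M)} \end{bmatrix} =
\begin{bmatrix}
R_{1,1}(k) & R_{1,2}(k) & \cdots & R_{1,M}(k) \\
R_{2,1}(k) & R_{2,2}(k) & \cdots & R_{2,M}(k) \\
\vdots & \vdots & \ddots & \vdots \\
R_{M,1}(k) & R_{M,2}(k) & \cdots & R_{M,M}(k)
\end{bmatrix}
\begin{bmatrix}
g_1\sqrt{P_1} & & & \\
& g_2\sqrt{P_2} & & \\
& & \ddots & \\
& & & g_M\sqrt{P_M}
\end{bmatrix}
\begin{bmatrix} s_k^{(1)} \\ s_k^{(2)} \\ \vdots \\ s_k^{(M)} \end{bmatrix} +
\begin{bmatrix} n_1(k) \\ n_2(k) \\ \vdots \\ n_M(k) \end{bmatrix} \tag{11.39a}$$

We can define the vectors

$$\mathbf{r}_k = \begin{bmatrix} r_k^{(1)} \\ r_k^{(2)} \\ \vdots \\ r_k^{(M)} \end{bmatrix} \qquad
\mathbf{s}_k = \begin{bmatrix} s_k^{(1)} \\ s_k^{(2)} \\ \vdots \\ s_k^{(M)} \end{bmatrix} \qquad
\mathbf{n}_k = \begin{bmatrix} n_1(k) \\ n_2(k) \\ \vdots \\ n_M(k) \end{bmatrix} \tag{11.39b}$$

We can also define matrices

$$\mathbf{R}_k = \begin{bmatrix}
R_{1,1}(k) & R_{1,2}(k) & \cdots & R_{1,M}(k) \\
R_{2,1}(k) & R_{2,2}(k) & \cdots & R_{2,M}(k) \\
\vdots & \vdots & \ddots & \vdots \\
R_{M,1}(k) & R_{M,2}(k) & \cdots & R_{M,M}(k)
\end{bmatrix} \tag{11.39c}$$

$$\mathbf{D} = \begin{bmatrix}
g_1\sqrt{P_1} & & & \\
& g_2\sqrt{P_2} & & \\
& & \ddots & \\
& & & g_M\sqrt{P_M}
\end{bmatrix} \tag{11.39d}$$

Then the $M$ output signal samples available for MUD can be written as

$$\mathbf{r}_k = \mathbf{R}_k \cdot \mathbf{D} \cdot \mathbf{s}_k + \mathbf{n}_k \tag{11.39e}$$

Notice that the noise vector $\mathbf{n}_k$ is Gaussian with zero mean and covariance matrix [Eq. (11.29)]

$$\overline{\mathbf{n}_k \mathbf{n}_k^T} = \frac{\mathcal{N}}{2}\, \mathbf{R}_k \tag{11.40}$$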



The goal of MUD receivers is to determine the unknown user data vector $\mathbf{s}_k$ based on the
received signal vector value $\mathbf{r}_k$. Based on the system model of Eq. (11.39), different joint
MUD receivers can be derived according to different criteria.

To simplify our notation in MUD discussions, we denote $\mathbf{A}^*$ as the conjugate of matrix
$\mathbf{A}$ and $\mathbf{A}^T$ as the transpose of matrix $\mathbf{A}$. Moreover, we denote the conjugate transpose of
matrix $\mathbf{A}$ as

$$\mathbf{A}^H = (\mathbf{A}^*)^T$$

The conjugate transpose of a matrix is also known as its Hermitian.

Optimum MUD: Maximum Likelihood Receiver 

The optimum MUD based on the signal model of Eq. (11.39) is the maximum likelihood 
detector (MLD) under the assumption of equally likely input symbols. As discussed in Sec. 10.6, 
the optimum receiver with minimum probability of symbol error is the MAP receiver 

s k = argmaxp(s A .|ri.) (11.41a) 


If all possible values of s k are equally likely, then the MAP detector reduces to the maximum 
likelihood detector (or MLD) 


s k = argmaxp (/■*!**) 


(11.41b) 


Because the noise vector n* is jointly Gaussian with zero mean and covariance matrix 0.5A^R k , 
we have 


= (^""[det^r'exp —^(rt-RkDsifR; 1 (r k -R k Ds k )* 

L j v 


(11.42) 


The MLD receiver can be implemented as 

maxp(r k \s k ) <==► min (r k - R k Ds k f R7 l (r k - R k Ds k )* 
Sk s k * 


min 

Si 


R k ~ l/2 in-R k Ds k ) 


2 


(11,43) 


The maximum likelihood MUD receiver is illustrated in Fig. 11.13.

Thus, the maximum likelihood MUD receiver must calculate and compare the values of

$$\left\| \mathbf{R}_k^{-1/2} (\mathbf{r}_k - \mathbf{R}_k\mathbf{D}\mathbf{s}_k) \right\|^2$$

for all possible choices of the unknown user symbol vector $\mathbf{s}_k$. If each user uses 16-QAM to
modulate its data, this optimum MUD receiver requires $16^M$ evaluations of Eq. (11.43). It is
evident that the optimum maximum likelihood MUD has a rather high complexity. Indeed, the
computational complexity increases exponentially with the number of CDMA users (Ref. 10). This
is the price paid for this optimum and near-far resistant CDMA receiver (Ref. 10).
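The exhaustive search behind Eq. (11.43) can be sketched in a few lines. The following is only an illustration under simplifying assumptions that are not part of the text: real-valued signals, a real symmetric correlation matrix, a diagonal gain matrix stored as the list `d`, and binary (plus or minus 1) symbols instead of 16-QAM. Under those assumptions, expanding the quadratic form of Eq. (11.43) shows that minimizing it is equivalent to maximizing $2\mathbf{s}^T\mathbf{D}\mathbf{r} - \mathbf{s}^T\mathbf{D}\mathbf{R}\mathbf{D}\mathbf{s}$, which avoids a matrix inverse. The function name `ml_mud` and the numerical values are hypothetical.

```python
from itertools import product

def ml_mud(r, R, d, alphabet=(-1.0, 1.0)):
    """Exhaustive ML search over all len(alphabet)**M candidate symbol vectors.

    Assumes real signals, real symmetric R, and diagonal gains d (illustrative only).
    """
    M = len(r)
    best, best_metric = None, float("-inf")
    for s in product(alphabet, repeat=M):
        a = [d[i] * s[i] for i in range(M)]                 # a = D s
        corr = sum(a[i] * r[i] for i in range(M))           # s^T D r
        quad = sum(a[i] * R[i][j] * a[j]                    # s^T D R D s
                   for i in range(M) for j in range(M))
        metric = 2.0 * corr - quad                          # larger is more likely
        if metric > best_metric:
            best, best_metric = s, metric
    return best

# Two users with cross-correlation 0.5; noise-free r = R D s for s = (+1, -1)
R = [[1.0, 0.5], [0.5, 1.0]]
d = [1.0, 0.5]
r = [0.75, 0.0]
decision = ml_mud(r, R, d)
print(decision)
```

The loop visits every candidate vector, so its cost grows as the alphabet size raised to the power M, which is the exponential growth noted above.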
Figure 11.13 Maximum likelihood multiuser detection (MUD) receiver.




Decorrelator Receiver 

The high complexity of the maximum likelihood MUD receiver reduces its attractiveness
in practical applications. To bring down the computational cost, several low-complexity and
suboptimum MUD receivers have been proposed. The decorrelator MUD is a linear method
that simply uses matrix multiplication to remove the MAI among different users. Based on
Eq. (11.39), the MAI among different users is caused by the nondiagonal correlation matrix
$\mathbf{R}_k$. Thus, the MAI effect can be removed by premultiplying with the pseudoinverse of
$\mathbf{R}_k$ to "decorrelate" the user signals.

$$\mathbf{R}_k^{-1}\,\mathbf{r}_k = \mathbf{D}\,\mathbf{s}_k + \mathbf{R}_k^{-1}\,\mathbf{n}_k \tag{11.44}$$

This decorrelating operation leaves only the noise term $\mathbf{R}_k^{-1}\mathbf{n}_k$ that can affect the
user signal. A QAM hard-decision device can be applied to detect the user symbols

$$\hat{\mathbf{s}}_k = \text{dec}\left(\mathbf{R}_k^{-1}\,\mathbf{r}_k\right) \tag{11.45}$$

Figure 11.14 is the block diagram of a decorrelator MUD receiver. Since the major operation
of a decorrelating MUD receiver lies in the matrix multiplication by $\mathbf{R}_k^{-1}$, the computational
complexity increases only in the order of $O(M^2)$. The decorrelator receiver is near-far resistant,
as detailed by Lupas and Verdú (Ref. 11).
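To make the near-far resistance concrete, here is a minimal sketch comparing a conventional sign decision on the raw received samples with the decorrelator decision of Eq. (11.45). It assumes real signals, binary symbols, no noise, and a 2-user correlation matrix chosen for illustration; the helper names `solve` and `sign` are hypothetical, and the linear system is solved by plain Gaussian elimination rather than an explicit inverse.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):                            # back substitution
        acc = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x

def sign(v):
    return 1 if v >= 0 else -1

R = [[1.0, 0.5], [0.5, 1.0]]   # assumed code cross-correlation
d = [10.0, 1.0]                # user 1 is 20 dB stronger than user 2
s = [1, -1]                    # transmitted symbols
r = [sum(R[i][j] * d[j] * s[j] for j in range(2)) for i in range(2)]  # noise-free r = R D s

matched = [sign(v) for v in r]            # conventional single-user decisions
decorr = [sign(v) for v in solve(R, r)]   # decisions on R^{-1} r
print(matched, decorr)
```

In this example the weak user's symbol is swamped by the strong user in the conventional receiver, while the decorrelated output recovers both symbols regardless of the power imbalance.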

Minimum Mean Square Error (MSE) Receiver 

The drawback of the decorrelator MUD receiver lies in the noise transformation by $\mathbf{R}_k^{-1}$.
In fact, when the correlation matrix $\mathbf{R}_k$ is ill conditioned, the noise transformation has the
negative effect of noise amplification. To mitigate this risk, a different and more robust linear
MUD (Refs. 12 and 13) minimizes the mean square error by finding the optimum matrix $\mathbf{G}$:

$$\min_{\mathbf{G}}\; E\left\{ \left\| \mathbf{s}_k - \mathbf{G}\,\mathbf{r}_k \right\|^2 \right\} \tag{11.46}$$

This $\mathbf{G}$ still represents a linear detector. Once $\mathbf{G}$ has been determined, the MUD receiver
simply takes a hard decision on the linearly transformed signal, that is,

$$\hat{\mathbf{s}}_k = \text{dec}\left(\mathbf{G}\,\mathbf{r}_k\right) \tag{11.47}$$


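The optimum matrix $\mathbf{G}$ can be determined by applying the principle of orthogonality
[Eq. (8.84), Sec. 8.5]. The principle of orthogonality requires that the error vector

$$\mathbf{s}_k - \mathbf{G}\,\mathbf{r}_k$$

be orthogonal to the received signal vector $\mathbf{r}_k$. In other words,

$$\overline{(\mathbf{s}_k - \mathbf{G}\,\mathbf{r}_k)\,\mathbf{r}_k^H} = \mathbf{0} \tag{11.48}$$

Thus, the optimum receiver matrix $\mathbf{G}$ can be found as

$$\mathbf{G} = \overline{\mathbf{s}_k \mathbf{r}_k^H}\,\left(\overline{\mathbf{r}_k \mathbf{r}_k^H}\right)^{-1} \tag{11.49}$$

Because the noise vector $\mathbf{n}_k$ and the signal vector $\mathbf{s}_k$ are independent,

$$\overline{\mathbf{s}_k \mathbf{n}_k^H} = \mathbf{0}_{M \times M}$$

is their cross-correlation.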

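In addition, we have earlier established equalities

$$\overline{\mathbf{s}_k \mathbf{s}_k^H} = \mathbf{I}_{M \times M} \qquad \overline{\mathbf{n}_k \mathbf{n}_k^H} = \frac{\mathcal{N}}{2}\,\mathbf{R}_k$$

where we use $\mathbf{I}_{M \times M}$ to denote the $M \times M$ identity matrix. Hence, we have

$$\overline{\mathbf{r}_k \mathbf{r}_k^H} = \mathbf{R}_k \mathbf{D}\mathbf{D}^H \mathbf{R}_k^H + \frac{\mathcal{N}}{2}\,\mathbf{R}_k \tag{11.50a}$$

$$\overline{\mathbf{s}_k \mathbf{r}_k^H} = \mathbf{D}^H \mathbf{R}_k^H \tag{11.50b}$$

The optimum linear receiver matrix is therefore

$$\mathbf{G}_k = \mathbf{D}^H \mathbf{R}_k^H \left( \mathbf{R}_k \mathbf{D}\mathbf{D}^H \mathbf{R}_k^H + \frac{\mathcal{N}}{2}\,\mathbf{R}_k \right)^{-1} \tag{11.51}$$

It is clear that when the channel noise is zero (i.e., $\mathcal{N} = 0$), then the optimum matrix given by
Eq. (11.51) degenerates into

$$\mathbf{G}_k = \mathbf{D}^H \mathbf{R}_k^H \left( \mathbf{R}_k \mathbf{D}\mathbf{D}^H \mathbf{R}_k^H \right)^{-1} = (\mathbf{R}_k \mathbf{D})^{-1}$$

which is essentially the decorrelator receiver.

The MMSE linear MUD receiver is shown in Fig. 11.15. Similar to the decorrelator
receiver, the major computational requirement comes from the matrix multiplication by $\mathbf{G}_k$.
The MMSE linear receiver is also near-far resistant (Ref. 11).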
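As a rough numerical check of Eq. (11.51), the sketch below applies the MMSE matrix to a received vector without forming an inverse: it solves the linear system with the bracketed matrix and then multiplies by $\mathbf{D}^H\mathbf{R}_k^H$. It assumes real signals, a real symmetric correlation matrix, and diagonal gains stored in the list `d`; the names `solve`, `mmse_estimate`, and `noise_psd` are hypothetical. With `noise_psd = 0` the output should coincide with the decorrelator result, as the text states.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x

def mmse_estimate(r, R, d, noise_psd):
    """Return G r with G = D R^T (R D D R^T + (noise_psd/2) R)^{-1} (real-valued case)."""
    M = len(r)
    A = [[sum(R[i][m] * d[m] ** 2 * R[j][m] for m in range(M))
          + 0.5 * noise_psd * R[i][j]
          for j in range(M)] for i in range(M)]
    y = solve(A, r)                                   # y = (...)^{-1} r
    return [d[i] * sum(R[j][i] * y[j] for j in range(M)) for i in range(M)]

R = [[1.0, 0.5], [0.5, 1.0]]
d = [10.0, 1.0]
r = [9.5, 4.0]                   # noise-free r = R D s for s = (+1, -1)

z_noiseless = mmse_estimate(r, R, d, 0.0)   # reduces to (R D)^{-1} r
z_noisy = mmse_estimate(r, R, d, 2.0)       # estimate shrinks toward zero
print(z_noiseless, z_noisy)
```

With a nonzero noise level the soft outputs are pulled toward zero, most visibly for the weak user, which is how the MMSE receiver trades residual interference against noise amplification.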
Figure 11.15 Minimum mean square error MUD receiver.



Decision Feedback Receiver 

We note that both the decorrelator and the MMSE MUD receivers apply linear matrix
processing. Hence, they are known as linear receivers with low complexity. On the other hand,
the optimum MUD receiver is nonlinear but requires much higher complexity. There is also a
very popular suboptimum receiver that is nonlinear. This method is based on the concept of
successive interference cancellation, known as the decision feedback MUD receiver (Refs. 14 and 15).

The main motivation behind the decision feedback MUD receiver lies in the fact that in
a near-far environment, not all users suffer equally. In a near-far environment, the stronger
signals are actually winners, whereas the weaker signals are losers. In fact, when a particular
user has a strength $\sqrt{P_i}\,g_i$ that is stronger than those of all other users, its conventional
matched filter receiver can in fact deliver better performance than is possible in an environment
of equal strength. Hence, it would make sense to rank the received users in the order of their
individual strength measured by $\{P_i g_i^2\}$. The strongest user QAM symbols can then be detected
first, using only the conventional matched filter receivers designed for single users. Once the
strongest user symbol is known, its interference effects on the remaining user signals can be
canceled. By canceling the strongest user symbol from the received signal vectors, there are only
$M - 1$ unknown user symbols for detection. Among them, the next strongest user signal can be
detected more accurately after the strongest interference has been removed. Hence, its effect
can also subsequently be canceled from received signals, to benefit the $M - 2$ remaining user
symbols, and so on. Finally, the weakest user signal will be detected last, after all the MAI has
been canceled.

Clearly, the decision feedback MUD receiver relies on the successive interference
cancellation of stronger user interferences for the benefit of weaker user signal detection. For this
reason, the decision feedback MUD receiver is also known as the successive interference
cancellation (SIC) receiver. The block diagram of the decision feedback MUD receiver appears
in Fig. 11.16. Based on Eq. (11.31), the following steps summarize the SIC receiver:

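Decision Feedback MUD

Step 1. Rank all user signal strengths $\{P_i g_i^2\}$. Without loss of generality, we assume that

$$P_1 g_1^2 > P_2 g_2^2 > \cdots > P_{M-1}\, g_{M-1}^2 > P_M g_M^2$$

Let $y_i^{(1)}$, $i = 1, \ldots, M$, be the $i$th element of the received signal vector $\mathbf{r}_k$, and let $\ell = 1$.

Step 2. Detect the $\ell$th (strongest remaining) user symbol via

$$\hat{s}_k^{(\ell)} = \text{dec}\left(y_\ell^{(\ell)}\right)$$

Step 3. Cancel the $\ell$th (strongest remaining) user interference from the received signals

$$y_i^{(\ell+1)} = y_i^{(\ell)} - [\mathbf{R}_k]_{i,\ell}\,[\mathbf{D}]_{\ell,\ell}\;\hat{s}_k^{(\ell)} \qquad i = \ell + 1, \ldots, M$$

where $[\mathbf{R}_k]_{i,\ell}$ is the $(i, \ell)$th element of $\mathbf{R}_k$ and $[\mathbf{D}]_{\ell,\ell}$ is the $\ell$th diagonal element of $\mathbf{D}$ in Eq. (11.39b).

Step 4. Let $\ell = \ell + 1$ and repeat step 2 until $\ell = M$.

Figure 11.16 Decision feedback MUD receiver based on successive interference cancellation
(assuming that all M users are ranked in the order of descending gains).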
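The four steps above can be sketched as a short loop. This is only an illustration, assuming real signals, binary symbols, and a received vector of the form $\mathbf{r} = \mathbf{R}\mathbf{D}\mathbf{s} + \mathbf{n}$ with diagonal gains held in the list `d`; the function name `sic_detect` and the numbers are hypothetical. Users are processed in order of decreasing power, and each decision is subtracted from the samples of the users not yet detected.

```python
def sic_detect(r, R, d):
    """Successive interference cancellation for binary symbols (illustrative sketch)."""
    M = len(r)
    y = list(r)                                   # working copy of received samples
    s_hat = [0] * M
    # Step 1: rank users by received power, strongest first
    order = sorted(range(M), key=lambda i: d[i] ** 2, reverse=True)
    for pos, l in enumerate(order):
        # Step 2: hard decision on the strongest remaining user
        s_hat[l] = 1 if y[l] >= 0 else -1
        # Step 3: cancel that user's contribution from the users still to be detected
        for i in order[pos + 1:]:
            y[i] -= R[i][l] * d[l] * s_hat[l]
        # Step 4: the loop moves on to the next strongest user
    return s_hat

R = [[1.0, 0.5], [0.5, 1.0]]
d = [10.0, 1.0]                 # user 1 much stronger than user 2
r = [9.5, 4.0]                  # noise-free r = R D s for s = (+1, -1)
decisions = sic_detect(r, R, d)
print(decisions)
```

In this noise-free example the weak user is detected correctly only after the strong user's interference has been subtracted; a wrong first decision would instead add interference, which is the error propagation discussed next.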


A decision feedback MUD receiver requires very little computation, since the interference
cancellation step requires only $O(M^2)$ complexity. It is a very sensible and low-complexity
receiver. Given correct symbol detection, strong interference cancellation from received weak
signals completely eliminates the near-far problem. The key drawback or weakness of the
decision feedback receiver lies in the effect of error propagation. Error propagation takes place
when, in step 2, a user symbol is detected incorrectly. As a result, this erroneous symbol
used in the interference cancellation of step 3 may in fact strengthen the MAI. This raises
the probability of a decision error on the subsequent user symbol, which in turn
can cause more decision errors. Analysis on the effect of error propagation can be found in
Refs. 14 and 15.


11.8 MODERN PRACTICAL DSSS CDMA SYSTEMS 

Since the 1990s, many important commercial applications have emerged for spread spectrum, 
including cellular telephones, personal communications, and position location. Here we discuss 
several popular applications of CDMA technology to illustrate the benefits of spread spectrum. 

11.8.1 CDMA in Cellular Phone Networks 

Cellular Networks 

The cellular network divides a service area into smaller geographical cells (Fig. 11.17). Each
cell has a base station tower to connect with the mobile users it serves. All base stations are wired
to the mobile telephone switching office (MTSO), which in turn is wired to the telephone
central office. A caller communicates via radio channel to its base station, which sends the
signal to the MTSO. The MTSO connects to the receiver either via the land-based telephone
system or via another base station. As the caller moves from one cell to another, a handoff
process takes place. During handoff, the MTSO automatically switches the user to an available
channel in the new cell while the call is in progress. The handoff is so rapid that users usually
do not notice it.

Figure 11.17 Cellular telephone system.

The true ingenuity of the cellular network lies in its ability to reuse the same frequency 
band in multiple cells. Without cells, high-powered transmitters can be used to cover an entire 
city. But this would allow a frequency channel to be used only by one user in the city at any 
moment. This posed serious limitations on the number of channels and simultaneous users. 
The limitation is overcome in the cellular scheme by reusing the same frequencies in all the 
cells except those immediately adjacent. This is possible because the transmitted powers are 
kept small enough to prevent the signals from one cell from reaching beyond the immediately 
adjacent cells. We can accommodate any number of users by increasing the number of cells as 
we reduce the cell size and the power levels correspondingly. 

The 1G (first-generation) analog cellular schemes use an audio signal to modulate an FM
signal with transmission bandwidth 30 kHz. This wideband FM signal results in a good SNR
but is highly inefficient in bandwidth usage and frequency reuse. The 2G (second-generation)
cellular systems are all digital. Among them, GSM and cdmaOne are two of the most widely
deployed cellular systems. GSM adopts a TDMA technology through which eight users share
a 200 kHz channel. The competing technology of cdmaOne (known earlier as IS-95) is a DSSS
system.

Why CDMA in Cellular Systems 

Although spread spectrum is inherently well suited against narrowband interferences and
affords a number of advantages in the areas of networking and handoff, the key characteristic
underlying the broad application of CDMA for wireless cellular systems is the potential
for improved spectral utilization. The capacity for improvement has two key sources. First,
the use of CDMA allows improved frequency reuse. Narrowband systems cannot use the same
transmission frequency in adjacent cells because of the potential for interference. CDMA has
inherent resistance to interference. Although users using different spreading codes from adjacent
cells will contribute to the total interference level, their contribution will be significantly
less than the interference from the same cell users. This leads to a much improved frequency
reuse efficiency. In addition, CDMA provides better overall capacity when the data traffic load
is dynamic. This is because users in a lightly loaded CDMA system would have a lower interference
level and better performance, whereas TDMA users with fixed channel bandwidth do
not enjoy such benefit.

Figure 11.18

Figure 11.19

CDMA Cellular System: cdmaOne (IS-95) 

The first commercially successful CDMA system in cellular applications was developed by the
Electronic Industries Association (EIA) as interim standard-95 (IS-95). Now under the official
name of cdmaOne, it employs DSSS by adopting 1.2288-Mchip/s spreading sequences on
both uplink and downlink. The uplink and downlink transmissions both occupy 1.25 MHz of
RF bandwidth, as illustrated in Fig. 11.18.

The QCELP (Qualcomm code-excited linear prediction) vocoder is used for voice encoding.
Since the voice coder exploits gaps and pauses in speech, the data rate is variable from
1.2 to 9.6 kbit/s. To keep the symbol rate constant, whenever the bit rate falls below the peak
bit rate of 9.6 kbit/s, a repetition code is used to fill the gaps. For example, if the output of
the voice coder (and subsequently the convolutional coder) falls to 2.4 kbit/s, the output is
repeated three more times before it reaches the interleaver. The transmitter of cdmaOne takes
advantage of this repetition time by reducing the output power during three out of the four
identical symbols by at least 20 dB. In this way, the multiple-access interference is diminished.
This “voice activity gating” reduces MAI and increases overall system capacity.
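A quick calculation shows how much average power the gating described above saves at the 2.4 kbit/s rate. This is a back-of-the-envelope sketch, assuming exactly 20 dB of attenuation on three of the four repeated symbols and full power on the remaining one; the variable names are hypothetical.

```python
import math

peak_rate_kbps = 9.6
current_rate_kbps = 2.4
copies = round(peak_rate_kbps / current_rate_kbps)    # each symbol is sent 4 times

reduction_db = 20.0                                   # assumed attenuation of gated copies
gated_power = 10 ** (-reduction_db / 10)              # relative power of a gated copy

# One copy at full power, the others attenuated
average_power = (1.0 + (copies - 1) * gated_power) / copies
average_power_db = 10 * math.log10(average_power)

print(copies, average_power, round(average_power_db, 2))
```

Under these assumptions the average transmitted power, and hence the interference this user contributes, drops by nearly 6 dB relative to full-rate transmission.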

The modulation of cdmaOne uses QPSK on the downlink, and the uplink uses a variant of
QPSK known as the offset QPSK (or OQPSK). There are other important differences between
the forward and reverse links. Figure 11.19 outlines the basic operations of spreading and
modulation on the forward link. After a rate 1/2 convolutional error correction code, the voice
data becomes 19.2 kbit/s. Interleaving then shuffles the data to alleviate burst error effects,
and long-code scrambling provides some nominal privacy protection. The data rate remains
at 19.2 kbit/s before being spread by a length 64 Walsh-Hadamard short code to result in a
sequence of rate 1.2288 Mchip/s. Because the forward link uses synchronous transmissions, in the
absence of channel distortions, there can be as many as 64 orthogonal data channels, each using
a distinct Walsh-Hadamard code. Both the in-phase (I) and the quadrature (Q) components of
the QPSK modulations carry the same data over the 1.25 MHz bandwidth, although different
masking codes are applied to I and Q.
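The forward-link rate chain quoted above can be verified with simple integer arithmetic. The sketch below only multiplies the figures given in the text (9.6 kbit/s vocoder rate, rate 1/2 coding, length 64 Walsh-Hadamard spreading); the variable names are illustrative.

```python
vocoder_rate_bps = 9_600          # peak QCELP output rate
code_rate_inverse = 2             # rate 1/2 convolutional code doubles the bit rate
walsh_length = 64                 # chips per coded bit

coded_rate_bps = vocoder_rate_bps * code_rate_inverse
chip_rate_cps = coded_rate_bps * walsh_length

print(coded_rate_bps, chip_rate_cps)
```

The result, 19,200 coded bits per second and 1,228,800 chips per second, matches the 19.2 kbit/s and 1.2288 Mchip/s figures in the text.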

The performance of the reverse link is of greater concern for two reasons. First, as discussed
earlier, the reverse link is subject to near-far effects. Second, since all transmissions
on the forward link originate at the same base station, it uses the orthogonal Walsh-Hadamard
spreading codes to generate synchronous signals with zero cross-correlation. The reverse link does
not enjoy this luxury. For this reason, more powerful error correction (rate 1/3) is employed
on the reverse link. Still, like the forward link, the raw QCELP vocoder bit rate is 9.6 kbit/s,
which is eventually spread to 1.2288 Mchip/s over a 1.25 MHz bandwidth.

As mentioned earlier, the near-far problem needs to be addressed when spread spectrum is
utilized in mobile communications. To combat this problem, IS-95 uses power control. On the
forward link there is a subchannel for power control purposes. Every 1.25 ms, the base station
receiver estimates the signal strength of the mobile unit. If it is too high, the base transmits a
1 on the subchannel. If it is too low, it transmits a 0. In this way, the mobile station adjusts its
power based on the 800 bit/s power control signal to reduce interference to other users.
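The one-bit feedback loop described above can be illustrated with a toy simulation. The text gives only the 1.25 ms command interval and the meaning of the bits; the 1 dB step size, the static channel, and the function name `power_control` below are assumptions made purely for illustration.

```python
def power_control(initial_db, target_db, step_db, n_commands):
    """Track received power (dB) under up/down commands; returns the power after each command."""
    p = initial_db
    history = []
    for _ in range(n_commands):
        command = 1 if p > target_db else 0        # 1: too high, 0: too low
        p += -step_db if command == 1 else step_db
        history.append(p)
    return history

commands_per_second = round(1 / 1.25e-3)           # one command every 1.25 ms
trace = power_control(initial_db=10.0, target_db=0.0, step_db=1.0,
                      n_commands=commands_per_second)
print(commands_per_second, trace[:12])
```

Starting 10 dB above the target, the received power reaches the target after ten commands (12.5 ms) and then toggles within one step of it, which is the characteristic behavior of a fixed-step up/down loop.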


3G Cellular Services (Refs. 16-19)

In the new millennium, wireless service providers are shifting their voice-centric 2G cellular
systems to the next-generation (3G) wireless systems, which are capable of supporting high-speed
data transmission and internet connection. For this reason, the International Mobile
Telecommunications-2000 standard (IMT-2000) is the global standard for third-generation
wireless communications. IMT-2000 provides a framework for worldwide wireless access of
fixed and mobile wireless access systems. The goal is to provide wireless cellular coverage up
to 144 kbit/s for high-speed mobile, 384 kbit/s for pedestrian, and 2.048 Mbit/s for indoor users.
Among the 3G standards, there are three major wireless technologies based on CDMA DSSS,
namely, the two competing versions of wideband CDMA from the 3rd Generation Partnership
Project (3GPP) and the 3rd Generation Partnership Project 2 (3GPP2), plus the TD-SCDMA
from the 3GPP for China.

Because 3G cellular systems continue to use the existing cellular band, a high data rate
for one user means a reduction of service for other active CDMA users within the same cell.
Otherwise, given the limited bandwidth, it is impossible to serve the same number of active
users as in cdmaOne while supporting data rates as high as 2.048 Mbit/s. Thus, the data rate
to and from the mobile unit must be variable according to the data traffic intensity within the
cell. Since most data traffic patterns (including internet usage) tend to be bursty, the variable rate
data service offered by 3G cellular is suitable for such applications.

Unlike FDMA and TDMA, CDMA provides a perfect environment for variable data rate
and requires very simple modifications. While FDMA and TDMA would require grouping
multiple frequency bands or time slots dynamically to support variable rate, CDMA needs to
change only the spreading gain. In other words, at higher data rates, a CDMA transmitter can
use a lower spreading factor. In this mode, its MAI to other users is high, and fewer such users
can be accommodated. At lower data rates, the transmitter uses a larger spreading factor, thus
allowing more users to transmit.

In 3GPP2’s CDMA2000 standard, there are two radio transmission modes: 1xRTT, utilizing
one 1.25 MHz band, and 3xRTT, which aggregates three 1.25 MHz bands. On the 1xRTT forward
link, the maximum data rate is 307.2 kbit/s with a spreading gain of 4. Thus, the chip rate
is still 1.2288 Mchip/s. A more recent 3GPP2 release is called CDMA2000 1xEV-DO revision A,
where EV-DO stands for “evolution data-optimized.” It can support a peak data rate
of 3.1 Mbit/s on the forward link of 1.25 MHz bandwidth. It does so by applying adaptive
coding and adaptive modulations, including QPSK, 8-PSK, and 16-QAM. At the peak rate,
the spreading gain is 1 (i.e., no spreading).

At the same time, the WCDMA by 3GPP applies similar ideas. Unlike CDMA2000,
WCDMA has a standard bandwidth of 5 MHz. When spreading is used, the chip rate is 4.096
Mchip/s. On downlink, the variable spreading factor of 3GPP WCDMA ranges from 512 to 4.
With QPSK modulation, this provides a variable data rate from 16 kbit/s to 2.048 Mbit/s.
Similar to CDMA2000, 3GPP WCDMA also has a counterpart to EV-DO known as high-speed
packet access (HSPA). On downlink, the recent HSPA release (Release 6) achieves the peak
rate of 14.4 Mbit/s. However, existing deployments can support a peak rate of only 7.2 Mbit/s.
Still, at this high rate, most data users would be quite satisfied, with the exception perhaps of
high-definition TV viewers.
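The relation between chip rate, spreading factor, and data rate used in the last few paragraphs can be checked numerically. The sketch below assumes the simple model rate = chip rate x bits per symbol / spreading factor and ignores coding and signaling overhead; the function name `data_rate` is hypothetical, and the chip rates are the ones quoted in the text.

```python
def data_rate(chip_rate, spreading_factor, bits_per_symbol=1):
    """Raw data rate in bit/s for a given chip rate and spreading factor (no overhead)."""
    return chip_rate * bits_per_symbol / spreading_factor

# CDMA2000 1xRTT forward link: spreading gain 4 at 1.2288 Mchip/s
rate_1xrtt = data_rate(1.2288e6, 4)

# WCDMA downlink with QPSK (2 bits per symbol) at 4.096 Mchip/s
rate_wcdma_low = data_rate(4.096e6, 512, bits_per_symbol=2)
rate_wcdma_high = data_rate(4.096e6, 4, bits_per_symbol=2)

print(rate_1xrtt, rate_wcdma_low, rate_wcdma_high)
```

These reproduce the 307.2 kbit/s, 16 kbit/s, and 2.048 Mbit/s figures, and show that each halving of the spreading factor doubles the rate available to one user.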

Power Control vs. MUD 

It is interesting to note that despite intense academic research interest in multiuser CDMA 
receivers (in the 1980s and 1990s), all cellular CDMA systems described here rely on power 
control to combat the near-far problem. The reason lies in the fact that power control is quite 
simple to implement and has proven to be very effective. On the other hand, MUD receivers 
require more computational complexity. To be effective, MUD receivers also require too much 
channel and signal information about all active users. Moreover, MUD receivers alone cannot 
completely overcome the disparity of performance in a near-far environment. 

11.8.2 CDMA in the Global Positioning System (GPS)

What Is GPS? 

The Global Positioning System (GPS) is the only fully functional global satellite navigation 
system. Utilizing a constellation of at least 24 satellites in medium Earth orbit to transmit 
precise RF signals, the system enables a GPS receiver to determine its location, speed, and 
direction. 

A GPS receiver calculates its position based on its distances to three or more GPS satellites.
Measuring the time delay between transmission and reception of each GPS microwave signal
gives the distance to each satellite, since the signal travels at a known speed. The signals also
carry information about the satellites' location. By determining the position of, and distance to,
at least three satellites, the receiver can compute its position using trilateration. Receivers
typically do not have perfectly accurate clocks and therefore track one or more additional
satellites to correct the receiver’s clock error.

Each GPS satellite continuously broadcasts its (navigation) message via BPSK at the rate
of 50 bit/s. This message is transmitted by means of two CDMA spreading codes: one for the
coarse/acquisition (C/A) mode and one for the precise (P) mode (encrypted for military use).
The C/A spreading code is a PN sequence with period of 1023 chips sent at 1.023 Mchip/s.
The spreading gain is L = 20,460. Most commercial users access only the C/A mode.*
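The spreading gains quoted for the two GPS codes follow directly from the chip rates and the 50 bit/s message rate. The short check below uses only those figures plus the length of a week in seconds; the variable names are illustrative.

```python
message_rate_bps = 50

ca_chip_rate = 1_023_000            # C/A code: 1.023 Mchip/s
p_chip_rate = 10_230_000            # P code: 10.23 Mchip/s

ca_gain = ca_chip_rate // message_rate_bps
p_gain = p_chip_rate // message_rate_bps

ca_period_ms = 1023 / ca_chip_rate * 1e3          # duration of one 1023-chip period
seconds_per_week = 7 * 24 * 3600
p_chips_per_week = p_chip_rate * seconds_per_week

print(ca_gain, p_gain, ca_period_ms, p_chips_per_week)
```

The C/A code repeats every millisecond, so each 20 ms message bit spans 20 code periods, while one week of P code at 10.23 Mchip/s contains about 6.1871 x 10^12 chips, in agreement with the footnote on the P code period.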

Originally developed for the military, GPS is now finding many uses in civilian life such as 
marine, aviation, and automotive navigation, as well as surveying and geological studies. GPS 
allows a person to determine the time and the person’s precise location (latitude, longitude, and 
altitude) anywhere on earth with an accuracy of inches. The person can also find the velocity 
with which he or she is moving. GPS receivers have become small and inexpensive enough to 
be carried by just about everyone in cars and boats. Handheld GPS receivers are plentiful and 
have even been incorporated into popular cellular phone units. 

How Does GPS Work? 

A GPS receiver operates by measuring its distance from a group of satellites in space, which
are acting as precise reference points. Since the GPS system consists of 24 satellites, there will
always be more than four orbiting bodies visible from anywhere on Earth. The 24 satellites
are located in six orbital planes at a height of about 20,200 km. Each satellite circles the earth in
12 hours. The satellites are constantly monitored by the U.S. Department of Defense, which
knows their exact locations and speeds at every moment. This information is relayed back to
the satellites. All the satellites have atomic clocks of unbelievable precision on board and are
synchronized to generate the same PN code at the same time. The satellites are continuously
transmitting this PN code and the information about their locations and time. A GPS receiver
on the ground is also generating the same PN code, although not in synchronism with that of
the satellites. This is because of the necessity to make GPS receivers inexpensive. Hence, the
timing of the PN code generated by the receiver will be off by an amount of α seconds (timing
bias) from that of the PN code of the satellites.

* The P spreading code rate is 10.23 Mchip/s with a spreading gain of L = 204,600. The P code period is
6.1871 x 10^12 bits long. In fact, at the chip rate of 10.23 Mchip/s, the code period is one week long!

To begin, let us assume that the timing bias α = 0. By measuring the time delay between
its own PN code and that received from one satellite, the receiver can compute its distance
d from that satellite. This information places the receiver anywhere on a sphere of radius d
centered at the satellite location (which is known), as shown in Fig. 11.20a. Simultaneous
measurements from three satellites place the receiver on the three spheres centered at the three
known satellite locations. The intersection of two spheres is a circle (Fig. 11.20b), and the
intersection of this circle with the third sphere narrows down the location to just two points, as
shown in Fig. 11.20c. One of these points is the correct location. But which one? Fortunately,
one of the two points would give a ridiculous answer. The incorrect point may not be on Earth,
or it may indicate an impossibly high receiver velocity. The computer in a GPS receiver has
various techniques for distinguishing the correct point from the incorrect one.

In practice, the timing bias α is not zero. To solve this problem, we need a distance
measurement from a fourth satellite. A user locates his or her position by receiving the signal
from four of the possible 24 satellites, as shown in Fig. 11.20d. There are four unknowns: the
coordinates in the three-dimensional space of the user along with a timing bias in the user's
receiver. These four unknowns can be solved by using four range equations to each of the four
satellites.
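One way to solve the four range equations is Newton's method on the measured ranges (often called pseudoranges), modeled here as true distance plus c times the timing bias. The sketch below is an illustration only: the satellite coordinates, the receiver position, the bias value, the sign convention for the bias, and the function names `solve` and `gps_fix` are all assumptions made for the example and are not taken from the text. Distances are in kilometers, and the fourth unknown is the bias expressed as a distance.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x

def gps_fix(sats, pseudoranges, iterations=20):
    """Newton iteration for (x, y, z, bias distance), starting from Earth's center."""
    est = [0.0, 0.0, 0.0, 0.0]
    for _ in range(iterations):
        J, f = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            dx, dy, dz = est[0] - sx, est[1] - sy, est[2] - sz
            dist = math.sqrt(dx * dx + dy * dy + dz * dz)
            f.append(dist + est[3] - rho)                   # range-equation residual
            J.append([dx / dist, dy / dist, dz / dist, 1.0])
        step = solve(J, [-v for v in f])
        est = [e + s for e, s in zip(est, step)]
    return est

# Hypothetical geometry (km): four satellites and a receiver near Earth's surface
sats = [(26600.0, 0.0, 0.0), (0.0, 26600.0, 0.0),
        (0.0, 0.0, 26600.0), (15000.0, 15000.0, 15000.0)]
truth = (3000.0, 4000.0, 4000.0)
bias_km = 30.0                                              # c * timing bias (about 0.1 ms)
rho = [math.dist(truth, s) + bias_km for s in sats]

est = gps_fix(sats, rho)
print([round(v, 6) for v in est])
```

With synthetic, noise-free measurements the iteration recovers both the position and the bias; a practical receiver would use more satellites and a least-squares fit to average out measurement noise.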

Since DSSS signals consist of a sequence of extremely short pulses, it is possible to
measure their arrival times accurately. The GPS system can result in accuracies of 10 meters
anywhere on Earth. The use of differential GPS can provide accuracy within centimeters. In
this case we use one terrestrial location whose position is known exactly. Comparison of its
known coordinates with those read by a GPS receiver (for the same location) gives us the error 
(bias) of the GPS system, which can be used to correct the errors of GPS measurements of other 
locations. This is based on the fact that satellite orbits are so high that any errors measured 
by one receiver will be almost exactly the same for any other receiver in the same locale. 
Differential GPS is currently used in such diverse applications as surveying, laying petroleum 
pipelines, aviation systems, marine navigation systems, and preparing highly accurate maps 
of everything from underground electric cabling to power poles. 


Why Spread Spectrum in GPS? 

The use of spread spectrum in the GPS system accomplishes three tasks. First, the signals from 
the satellites can be kept from unauthorized use. Second, and more important in a practical 
sense, the inherent processing gain of spread spectrum allows reasonable power levels to be 
used. Since the cost of a satellite is proportional to its weight, it is desirable to reduce the power 
required as much as possible. In addition, since each satellite must see an entire hemisphere, 
very little antenna gain is possible. For high accuracy, short pulses are required to provide 
fine resolution. This results in high spectrum occupancy and a received signal that is several 
decibels below the noise floor. Since range information needs to be calculated only about once
every second, the data bandwidth need be only about 100 Hz. This is a natural match for
spread spectrum. Despreading the received signal in the receiver, in turn, yields a significant
processing gain, thus allowing good reception at reasonable power levels. The third reason for
spread spectrum is that each satellite can use the same frequency band, yet there is no mutual
interference owing to the near orthogonality of each user's signal.

Figure 11.20

Each satellite circles the earth in 12 hours and emits two PN sequences modulated in phase 
quadrature at two frequencies. Two frequencies are needed to correct for the delay introduced 
by the ionosphere. 


11.8.3 IEEE 802.11b Standard for Wireless LAN

IEEE 802.11b is a commercial standard developed for wireless local area networks (WLAN)
to provide high-speed wireless connection to (typically) laptop computers.
Like its predecessor IEEE 802.11, IEEE 802.11b operates in the license-free ISM band of
2.4 to 2.4835 GHz. Similar to cellular networks, all laptop computers within a small coverage
area form 1-to-1 communication links with an “access point.” The access point is typically
connected to the Internet via a high-speed connection that can deliver the traffic to and from
laptop computers. In this way, the access point serves as a bridge between the computers and
the Internet.

The ISM band is populated with signals from many unlicensed wireless devices such as
microwave ovens, baby monitors, cordless phones, and wireless controllers. Hence, to transmit
WLAN data, interference resistance against these unlicensed transmissions is essential. For this
reason, spread spectrum is a very effective technology.

The simple FSK used in the FHSS IEEE 802.11 provides up to 2 Mbit/s data rate and
is simple to implement. Still, the link data rate is quite low. Because the laptop is a relatively
powerful device capable of supplying moderate levels of power and computation, it can support
more complex and faster modulation. IEEE 802.11b eliminates the FHSS option and fully
adopts the DSSS transmission. It pushes the data rate up to 11 Mbit/s, which is reasonably
satisfactory to most computer connections.

Internationally, there are 14 DSSS channels defined over the ISM band, although not all
channels are available in every country. In North America, there are 11 (overlapping) channels
of bandwidth 22 MHz. The channel spacing is 5 MHz. Table 11.2 illustrates the 11 DSSS
channels.
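The channel plan just described (first center frequency 2.412 GHz, 5 MHz spacing, 22 MHz bandwidth) determines which channels can be used side by side. The sketch below is an illustration that treats two channels as overlapping when their center frequencies are closer than one channel bandwidth; the function names are hypothetical.

```python
def center_mhz(channel):
    """Center frequency in MHz of North American 802.11b channels 1 to 11."""
    return 2412 + 5 * (channel - 1)

def overlap(a, b, bandwidth_mhz=22):
    """True if the two 22 MHz channel masks overlap in frequency."""
    return abs(center_mhz(a) - center_mhz(b)) < bandwidth_mhz

# Greedily pick channels, starting from channel 1, that do not overlap any chosen one
chosen = []
for ch in range(1, 12):
    if all(not overlap(ch, c) for c in chosen):
        chosen.append(ch)

print(chosen, center_mhz(11), overlap(1, 5), overlap(1, 6))
```

Because five channel steps correspond to 25 MHz, which exceeds the 22 MHz bandwidth, channels 1, 6, and 11 form the largest set of mutually non-overlapping channels among the 11.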

The chip rate of IEEE 802.11b is 11 Mchip/s, and the spread spectrum transmission bandwidth
is approximately 25 MHz. The 802.11b data rate can be 1, 2, 5.5, and 11 Mbit/s. For
1 and 2 Mbit/s data rates, differential BPSK and differential QPSK are used, respectively.
At high data rates of 5.5 and 11 Mbit/s, a more sophisticated complementary code keying
(CCK) was developed. The link data rate is established based on how good the channel
condition is. The different spreading gains for the 802.11b DSSS modulation are given in
Table 11.3.

Note that each access point may serve multiple links. Additionally, there may be more
than one access point at a given area. To avoid spectral overlap, different network links must
be separated by a minimum of five channel numbers. For example, channel 1, channel 6,
and channel 11 can coexist without mutual interference. Often, a neighborhood may be very
congested with multiple network coverage. Thus, spectral overlapping becomes unavoidable.
When different networks utilize spectrally overlapping channels, signal collisions may take
place. Data collisions are not resolved by radio transmitters and receivers (physical layer).
Rather, network protocols are developed to force all competing networks and users to back off
(i.e., to wait for a timer to expire before transmitting a finite data packet). In 802.11 WLAN,
the timer is set to a random value based on a traffic-dependent uniform distribution. This
backoff protocol to resolve data collisions in WLAN is known as the distributed coordination
function (DCF).

TABLE 11.2
2.4 GHz ISM Channel Assignment in IEEE 802.11b

Channel         1      2      3      4      5      6      7      8      9      10     11
Center f, GHz   2.412  2.417  2.422  2.427  2.432  2.437  2.442  2.447  2.452  2.457  2.462

TABLE 11.3
Modulation Format and the Spreading Factor in IEEE 802.11b Transmission

Chip rate        11 MHz
Data rate        1 Mbit/s            2 Mbit/s            5.5 Mbit/s   11 Mbit/s
Modulation       Differential BPSK   Differential QPSK   CCK          CCK
Spreading gain   11                  11                  2            1

Figure 11.21

To allow multiple links to share the same channel, DCF forces each link to vacate the
channel for a random period of time. This means that the maximum data rate of 11 Mbit/s
cannot be achieved by any of the competing users. As shown in Fig. 11.21, the two computers
both using channel 11 to connect to the access point must resort to DCF to reduce their access
time, which lowers their effective data rate. In this case, perfect coordination would
be able to allocate 11 Mbit/s equally between the two users. This idealistic situation is really
impossible under the distributed protocol of DCF. Under DCF, the maximum throughput of
either user would be much lower than 5.5 Mbit/s.

IEEE 802.11b is without a question one of the most successful of the wireless standards that
are responsible for opening up the commercial WLAN market. Nevertheless, to further improve
the spectral efficiency and to increase the possible data rate, a new modulation scheme known
as orthogonal frequency division multiplexing (OFDM) was incorporated into the follow-up
standards of IEEE 802.11a and IEEE 802.11g.* The principles and analysis of OFDM will be
discussed next in Chapter 12.


11.9 MATLAB EXERCISES 

In this section of computer exercises, we provide some opportunities for readers to learn firsthand
about the implementation and behavior of spread spectrum communications. We consider the
cases of frequency hopping spread spectrum (FHSS), direct sequence spread spectrum (DSSS)
or CDMA, and multiuser CDMA systems. We test the narrowband jamming effect on spread
spectrum communications and the near-far effect on multiuser CDMA systems.


* IEEE 802.11g operates in the same ISM band as IEEE 802.11b and must be backward compatible. Thus, IEEE
802.11g includes both the CDMA and the OFDM mechanisms. IEEE 802.11a, however, operates in the 5 GHz band
and uses OFDM exclusively.

COMPUTER EXERCISE 11.1: FHSS FSK COMMUNICATION
UNDER PARTIAL BAND JAMMING

The first MATLAB program, Ex11_1.m, implements an FHSS communication system that utilizes FSK
and noncoherent detection receivers. By providing an input value of 1 (with jamming) or 0 (without
jamming), we can illustrate the effect of FHSS against partial band jamming signals.

TABLE 11.4
Parameters Used in Computer Exercise 11.1

Number of users                           m = 1
Spreading factor (number of FSK bands)    L = 8
Number of hops per symbol per bit         L_h = 1
Modulation                                BFSK
Detection                                 Noncoherent
Partial band jamming                      1 fixed FSK band


In Ex11_1.m, the parameters of the FHSS system are given in Table 11.4. When partial band
jamming is turned on, a fixed but randomly selected FSK channel is blanked out by jamming. Under
additive white Gaussian channel noise, the effect of partial band jamming on the FHSS user is shown in
Fig. 11.22. Without jamming, the FHSS performance matches that of the FSK
analysis in Sec. 11.1 and Chapter 10. When partial jamming is turned on, the BER of the FHSS system
has a floor of 1/(2L) as shown in Eq. (11.4). As L increases from 4 to 8, and to 16, the performance clearly
improves.
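The BER floor can be checked numerically with a short sketch. It assumes one hop per bit, a single jammed band out of L chosen with probability 1/L, and a coin-flip decision (error probability 1/2) whenever the hop lands in the jammed band; the unjammed BER is the noncoherent BFSK expression used for the analytical curve in the program below. The function names are hypothetical.

```python
import math


def bfsk_noncoherent_ber(ebn0_db):
    """Noncoherent BFSK bit error rate, 0.5*exp(-Eb/(2N)), for Eb/N given in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.exp(-ebn0 / 2.0)


def fhss_jammed_ber(ebn0_db, num_bands):
    """BER when one of num_bands hop channels is jammed (assumed error rate 1/2 there)."""
    p_jam = 1.0 / num_bands
    return (1.0 - p_jam) * bfsk_noncoherent_ber(ebn0_db) + p_jam * 0.5


if __name__ == "__main__":
    for num_bands in (4, 8, 16):
        # At very high Eb/N the BER approaches the floor 1/(2L).
        print(num_bands, fhss_jammed_ber(60.0, num_bands), 1.0 / (2 * num_bands))
```

Under these assumptions the floor halves each time L doubles, which matches the improvement described for L = 4, 8, and 16.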

% MATLAB PROGRAM <Ex11_1.m>
% This program provides simulation for FHSS signaling using
% non-coherent detection of FSK.
% The jammer will jam 1 of the L frequency bands and
% can be turned on or off by inputting jamming=1 or 0
% Non-coherent MFSK detection
% only needs to compare the magnitude of each frequency bin.
%
clear;clf
n=10000;    % Number of data symbols in the simulation
L=8;        % Number of frequency bands
Lh=1;       % Number of hops per symbol (bit)
m=1;        % Number of users
% Generating information bits
s_data=round(rand(n,m));
% Turn partial band jamming on or off
jamming=input('jamming=? (Enter 1 for Yes, 0 for No)');
% Generating random phases on the two frequencies
xbase1=[exp(j*2*pi*rand(Lh*n,1))];
xbase0=[exp(j*2*pi*rand(Lh*n,1))];
% Modulating two orthogonal frequencies
xmodsig=[kron(s_data,ones(Lh,1)).*xbase1 kron((1-s_data),ones(Lh,1)).*xbase0];
clear xbase0 xbase1;
% Generating a random hopping sequence nLh long
Phop=round(rand(Lh*n,1)*(L-1))+1;       % PN hopping pattern;
Xsiga=sparse(1:Lh*n,Phop,xmodsig(:,1));
Xsigb=sparse(1:Lh*n,Phop,xmodsig(:,2));
% Generating noise sequences for both frequency channels
noise1=randn(Lh*n,1)+j*randn(Lh*n,1);
noise2=randn(Lh*n,1)+j*randn(Lh*n,1);
Nsiga=sparse(1:Lh*n,Phop,noise1);
Nsigb=sparse(1:Lh*n,Phop,noise2);
clear noise1 noise2 xmodsig;
BER=[];
BER_az=[];
% Add a jammed channel (randomly picked)
if (jamming)
    nch=round(rand*(L-1))+1;
    Xsiga(:,nch)=Xsiga(:,nch)*0;
    Xsigb(:,nch)=Xsigb(:,nch)*0;
    Nsiga(:,nch)=Nsiga(:,nch)*0;
    Nsigb(:,nch)=Nsigb(:,nch)*0;
end
% Generating the channel noise (AWGN)
for i=1:10,
    Eb2N(i)=i;                    % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=1/(2*Eb2N_num);         % 1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    ych1=Xsiga+signois*Nsiga;     % AWGN complex channels
    ych2=Xsigb+signois*Nsigb;     % AWGN channels
    % Non-coherent detection
    for kk=0:n-1,
        Yvec1=[];Yvec2=[];
        for kk2=1:Lh,
            Yvec1=[Yvec1 ych1(kk*Lh+kk2,Phop(kk*Lh+kk2))];
            Yvec2=[Yvec2 ych2(kk*Lh+kk2,Phop(kk*Lh+kk2))];
        end
        ydim1=Yvec1*Yvec1';
        ydim2=Yvec2*Yvec2';
        dec(kk+1)=(ydim1>ydim2);
    end
    clear ych1 ych2;
    % Compute BER from simulation
    BER=[BER; sum(dec'~=s_data)/n];
    % Compare against analytical BER.
    BER_az=[BER_az; 0.5*exp(-Eb2N_num/2)];
end
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BER,'k-o');
set(figber,'Linewidth',2);
legend('Analytical BER','FHSS simulation');
fx=xlabel('E_b/N (dB)');
fy=ylabel('Bit error rate');
set(fx,'FontSize',11); set(fy,'Fontsize',11);

Figure 11.22 Performance of FHSS noncoherent detection under partial band jamming.



COMPUTER EXERCISE 11.2: DSSS TRANSMISSION OF QPSK

In this exercise, we perform a DSSS baseband system test under narrowband jamming. For spreading
in this case, we apply the Barker code of length 11

pcode = [1 1 1 -1 -1 -1 1 -1 -1 1 -1]

because of its nice spectrum spreading property as a short code. We assume that the channel
noises are additive white Gaussian. MATLAB program Ex11_2b.m provides the results of a DSSS user
with QPSK modulation under narrowband jamming.
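One way to see why this short code spreads well is to look at its aperiodic autocorrelation, which has a single peak of 11 and sidelobes of magnitude at most 1. The following minimal sketch (helper name `aperiodic_autocorr` is hypothetical) computes it for the length-11 Barker code quoted above.

```python
BARKER11 = [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, -1]


def aperiodic_autocorr(code):
    """Return R[k] = sum_i code[i]*code[i+k] for lags k = 0..len(code)-1."""
    n = len(code)
    return [sum(code[i] * code[i + k] for i in range(n - k)) for k in range(n)]


if __name__ == "__main__":
    r = aperiodic_autocorr(BARKER11)
    print(r)  # peak at lag 0, sidelobes of magnitude at most 1
```

The sharp peak relative to the sidelobes is what lets the despreader concentrate the desired signal while spreading out narrowband interference.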

% MATLAB PROGRAM <Ex11_2b.m>
% This program provides simulation for DS-CDMA signaling using
% coherent QAM detection.
% To illustrate the CDMA spreading effect, a single user is spread by
% PN sequence of different lengths. Jamming is added as a narrowband;
% Changing spreading gain Lc;
clear;clf
Ldata=20000;    % data length in simulation; Must be divisible by 8
Lc=11;          % spreading factor vs data rate
                % can also use the shorter Lc=7
% Generate QPSK modulation symbols
data_sym=2*round(rand(Ldata,1))-1+j*(2*round(rand(Ldata,1))-1);
jam_data=2*round(rand(Ldata,1))-1+j*(2*round(rand(Ldata,1))-1);
% Generating a spreading code
pcode=[1 1 1 -1 -1 -1 1 -1 -1 1 -1]';
% Now spread
x_in=kron(data_sym,pcode);
% Signal power of the channel input is 2*Lc
% Jamming power is relative
SIR=10;                     % SIR in dB
Pj=2*Lc/(10^(SIR/10));
% Generate noise (AWGN)
noiseq=randn(Ldata*Lc,1)+j*randn(Ldata*Lc,1);   % Power is 2
% Add jamming sinusoid; sampling frequency is fc = Lc
jam_mod=kron(jam_data,ones(Lc,1)); clear jam_data;
jammer=sqrt(Pj/2)*jam_mod.*exp(j*2*pi*0.12*(1:Ldata*Lc)).';   % fj/fc=0.12.
clear jam_mod;
[P,x]=pwelch(x_in,[],[],[4096],Lc,'twosided');
figure(1);
semilogy(x-Lc/2,fftshift(P));
axis([-Lc/2 Lc/2 1.e-2 1.e2]);
grid;
xfont=xlabel('frequency (in unit of 1/T_s)');
yfont=ylabel('CDMA signal PSD');
set(xfont,'FontSize',11);set(yfont,'FontSize',11);
[P,x]=pwelch(jammer+x_in,[],[],[4096],Lc,'twosided');
figure(2);semilogy(x-Lc/2,fftshift(P));
grid;
axis([-Lc/2 Lc/2 1.e-2 1.e2]);
xfont=xlabel('frequency (in unit of 1/T_s)');
yfont=ylabel('CDMA signal + narrowband jammer PSD');
set(xfont,'FontSize',11);set(yfont,'FontSize',11);
BER=[];
BER_az=[];
for i=1:10,
    Eb2N(i)=(i-1);                % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Lc/(2*Eb2N_num);        % 1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    y_out=x_in+awgnois+jammer;
    Y_out=reshape(y_out,Lc,Ldata).'; clear y_out awgnois;
    % Despread first
    z_out=Y_out*pcode;
    % Decision based on the sign of the samples
    dec1=sign(real(z_out))+j*sign(imag(z_out));
    % Now compare against the original data to compute BER
    BER=[BER;sum([real(data_sym)~=real(dec1);...
        imag(data_sym)~=imag(dec1)])/(2*Ldata)];
    BER_az=[BER_az;0.5*erfc(sqrt(Eb2N_num))];   % analytical
end
figure(3)
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BER,'k-o');
legend('No jamming','Narrowband jamming (-10 dB)');
set(figber,'LineWidth',2);
xfont=xlabel('E_b/N (dB)');
yfont=ylabel('Bit error rate');
title('DSSS (CDMA) with spreading gain = 11');



Because the spreading factor in this case is L = 11, the DSSS signal occupies a bandwidth approximately
11 times wider. We add a narrowband QPSK jamming signal whose carrier is offset from the user signal
carrier by 1.32/T. The signal-to-interference ratio (SIR) can be adjusted. In Fig. 11.23,
we can witness power spectral densities before and after the addition of the jamming signal when
SIR = 10 dB. Despreading at the receiver enables us to find the resulting BER of the QPSK signal
under different jamming levels (Fig. 11.24). As the jamming signal becomes stronger and stronger, we
will need to apply larger spreading factors to mitigate the degrading effect on the BER.

Figure 11.23 Power spectral densities of DSSS signal using Barker code of length 11 for spreading: (a) without narrowband jamming; (b) with narrowband jamming at SIR = 10 dB.



COMPUTER EXERCISE 11.3: MULTIUSER DS-CDMA SYSTEM

To implement DS-CDMA systems, we must select multiple spreading codes with good cross-correlation
and autocorrelation properties. Gold sequences are a very well-known class of such good spreading
codes. Note that the Gold sequences are not mutually orthogonal. They have some nonzero but small
cross-correlations that can degrade the multiuser detection performance. We select four Gold sequences
to spread four QPSK users of equal transmission power. No near-far effect is considered in this example.
The first MATLAB program, gold31code.m, assigns four Gold sequences of length 31 to the
four QPSK modulated user signals:

% MATLAB PROGRAM <gold31code.m>
% to generate a table of 4 Gold sequences
% with length 31 each.
GPN=[ 1  1  1 -1
     -1  1 -1  1
     -1 -1  1  1
      1  1 -1 -1
     -1 -1 -1 -1
      1  1  1  1
      1  1 -1 -1
     -1 -1 -1  1
     -1  1 -1 -1
      1 -1 -1  1
     -1 -1  1  1
      1  1 -1  1
      1 -1 -1 -1
     -1  1  1  1
      1 -1  1  1
     -1  1  1 -1
     -1 -1  1 -1
     -1  1  1 -1
      1  1  1 -1
      1 -1  1  1
      1 -1  1 -1
      1  1 -1 -1
      1  1  1  1
      1  1 -1 -1
     -1 -1 -1 -1
      1  1 -1  1
      1 -1 -1 -1
     -1  1 -1  1
      1  1 -1 -1
      1  1  1  1
      1  1  1  1];

The main MATLAB program, Ex11_3.m, completes the spreading of the four user signals. The
four spread CDMA signals are summed together at the receiver before detection. Each of the four users
will apply the conventional despreader (matched filter) at the receiver before making the symbol-by-symbol
decision. We provide the resulting BER of all four users in Fig. 11.25 under additive white
Gaussian noise. We also give the single-user BER in AWGN channel as a reference. All four users have
the same BER. The small degradation of the multiuser BER from the single-user BER is caused by the
nonorthogonal spreading codes.

% MATLAB PROGRAM <Ex11_3.m>
% This program provides simulation for multiuser DS-CDMA signaling using
% coherent QPSK for 4 users.
%
%clear;clf
Ldata=10000;    % data length in simulation; Must be divisible by 8
Lc=31;          % spreading factor vs data rate
%User number = 4;
% Generate QPSK modulation symbols
data_sym=2*round(rand(Ldata,4))-1+j*(2*round(rand(Ldata,4))-1);
% Select 4 spreading codes (Gold Codes of Length 31)
gold31code;
pcode=GPN;
% Spreading codes are now in matrix pcode of 31x4
PowerMat=diag(sqrt([1 1 1 1]));
pcodew=pcode*PowerMat;
% Now spread
x_in=kron(data_sym(:,1),pcodew(:,1))+kron(data_sym(:,2),pcodew(:,2))+...
    kron(data_sym(:,3),pcodew(:,3))+kron(data_sym(:,4),pcodew(:,4));
% Signal power of the channel input is 2*Lc
% Generate noise (AWGN)
noiseq=randn(Ldata*Lc,1)+j*randn(Ldata*Lc,1);   % Power is 2
BER1=[];
BER2=[];
BER3=[];
BER4=[];
BER_az=[];
for i=1:12,
    Eb2N(i)=(i-1);                % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Lc/(2*Eb2N_num);        % 1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    y_out=x_in+awgnois;
    Y_out=reshape(y_out,Lc,Ldata).'; clear y_out awgnois;
    % Despread first
    z_out=Y_out*pcode;
    % Decision based on the sign of the samples
    dec=sign(real(z_out))+j*sign(imag(z_out));
    % Now compare against the original data to compute BER
    BER1=[BER1;sum([real(data_sym(:,1))~=real(dec(:,1));...
        imag(data_sym(:,1))~=imag(dec(:,1))])/(2*Ldata)];
    BER2=[BER2;sum([real(data_sym(:,2))~=real(dec(:,2));...
        imag(data_sym(:,2))~=imag(dec(:,2))])/(2*Ldata)];
    BER3=[BER3;sum([real(data_sym(:,3))~=real(dec(:,3));...
        imag(data_sym(:,3))~=imag(dec(:,3))])/(2*Ldata)];
    BER4=[BER4;sum([real(data_sym(:,4))~=real(dec(:,4));...
        imag(data_sym(:,4))~=imag(dec(:,4))])/(2*Ldata)];
    BER_az=[BER_az;0.5*erfc(sqrt(Eb2N_num))];   % analytical
end
BER=[BER1 BER2 BER3 BER4];
figure(1)
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BER1,'k-o',Eb2N,BER2,'k-s',...
    Eb2N,BER3,'k-v',Eb2N,BER4,'k-*');
legend('Single-user (analysis)','User 1 BER','User 2 BER',...
    'User 3 BER','User 4 BER')
axis([0 12 0.99e-5 1.e0]);
set(figber,'LineWidth',2);
xlabel('E_b/N (dB)');ylabel('QPSK bit error rate')
title('4-user CDMA BER with Gold code of length 31');

Figure 11.25 Performance of DS-CDMA conventional single-user detection without the near-far effect.



COMPUTER EXERCISE 11.4: MULTIUSER CDMA DETECTION
IN NEAR-FAR ENVIRONMENT

We can now modify the program in Computer Exercise 11.3 to include the near-far effect. Among the
four users, user 2 and user 4 have the same power and are the weaker users from far transmitters. User
1 is 10 dB stronger, while user 3 is 7 dB stronger. In this near-far environment, both users 2 and 4 suffer
from strong interference (users 1 and 3) signals due to the lack of code orthogonality. Note that the two
weak users do not have the same level of multiuser interference (MUI) from other users because of the
difference in their correlations.

MATLAB program Ex11_4a.m compares the performance of the conventional single-user receiver
with the performance of the decorrelator multiuser detector (MUD) described in Sec. 11.7. We show the
performance results of user 2 and user 4 in Fig. 11.26.

% MATLAB PROGRAM <Ex11_4a.m>
% This program provides simulation for multiuser CDMA system
% that experiences the near-far effect due to user Tx power
% variations.
%
% Decorrelator receivers are
% applied to mitigate the near-far effect
%
%clear;clf
Ldata=100000;   % data length in simulation; Must be divisible by 8
Lc=31;          % spreading factor vs data rate
%User number = 4;
% Generate QPSK modulation symbols
data_sym=2*round(rand(Ldata,4))-1+j*(2*round(rand(Ldata,4))-1);
% Select 4 spreading codes (Gold Codes of Length 31)
gold31code;
pcode=GPN;
% Spreading codes are now in matrix pcode of 31x4
PowerMat=diag(sqrt([10 1 5 1]));
pcodew=pcode*PowerMat;
Rcor=pcodew'*pcodew;
Rinv=pinv(Rcor);
% Now spread
x_in=kron(data_sym(:,1),pcodew(:,1))+kron(data_sym(:,2),pcodew(:,2))+...
    kron(data_sym(:,3),pcodew(:,3))+kron(data_sym(:,4),pcodew(:,4));
% Signal power of the channel input is 2*Lc
% Generate noise (AWGN)
noiseq=randn(Ldata*Lc,1)+j*randn(Ldata*Lc,1);   % Power is 2
BERb2=[];
BERa2=[];
BERb4=[];
BERa4=[];
BER_az=[];
for i=1:13,
    Eb2N(i)=(i-1);                % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Lc/(2*Eb2N_num);        % 1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    y_out=x_in+awgnois;
    Y_out=reshape(y_out,Lc,Ldata).'; clear y_out awgnois;
    % Despread first and apply decorrelator Rinv
    z_out=(Y_out*pcode);          % despreader (conventional) output
    clear Y_out;
    z_dcr=z_out*Rinv;             % decorrelator output
    % Decision based on the sign of the single receivers
    dec1=sign(real(z_out))+j*sign(imag(z_out));
    dec2=sign(real(z_dcr))+j*sign(imag(z_dcr));
    % Now compare against the original data to compute BER of user 2
    % and user 4 (weaker ones).
    BERa2=[BERa2;sum([real(data_sym(:,2))~=real(dec1(:,2));...
        imag(data_sym(:,2))~=imag(dec1(:,2))])/(2*Ldata)];
    BERa4=[BERa4;sum([real(data_sym(:,4))~=real(dec1(:,4));...
        imag(data_sym(:,4))~=imag(dec1(:,4))])/(2*Ldata)];
    BERb2=[BERb2;sum([real(data_sym(:,2))~=real(dec2(:,2));...
        imag(data_sym(:,2))~=imag(dec2(:,2))])/(2*Ldata)];
    BERb4=[BERb4;sum([real(data_sym(:,4))~=real(dec2(:,4));...
        imag(data_sym(:,4))~=imag(dec2(:,4))])/(2*Ldata)];
    BER_az=[BER_az;0.5*erfc(sqrt(Eb2N_num))];   % analytical
end
figure(1)
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BERa2,'k-o',Eb2N,BERa4,'k-s',...
    Eb2N,BERb2,'k--o',Eb2N,BERb4,'k--s');
legend('Single-user (analysis)','User 2 (single user detector)',...
    'User 4 (single user detector)','User 2 (decorrelator)',...
    'User 4 (decorrelator)')
axis([0 12 0.99e-5 1.e0]);
set(figber,'LineWidth',2);
xlabel('E_b/N (dB)');ylabel('QPSK bit error rate')
title('Weak-user BER comparisons');

Figure 11.26 Performance of decorrelator MUD in comparison with the conventional single-user receiver.


We also implement the decision feedback MUD of Sec. 11.7 in MATLAB program Ex11_4b.m.
The decision feedback MUD performance of the two users is shown in Fig. 11.27.
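The idea behind decision feedback detection, deciding the strongest user first and cancelling its contribution before detecting a weaker one, can be sketched in a noise-free toy case. This is an illustration only: the two length-4 codes are hypothetical, the strong user's amplitude `a1` is assumed known at the receiver, and the cancellation is done on the received chips rather than on the despreader outputs as in the program below.

```python
C1 = [1, 1, 1, 1]     # hypothetical code of the strong user
C2 = [1, 1, 1, -1]    # hypothetical code of the weak user


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sign(x):
    return 1 if x >= 0 else -1


def received(a1, b1, a2, b2):
    """Noise-free chip-rate sum of the two spread BPSK signals."""
    return [a1 * b1 * x + a2 * b2 * y for x, y in zip(C1, C2)]


def detect_with_feedback(r, a1):
    """Decide the strong user, subtract its reconstructed signal, then decide the weak user."""
    b1_hat = sign(dot(C1, r))
    residual = [x - a1 * b1_hat * c for x, c in zip(r, C1)]
    b2_hat = sign(dot(C2, residual))
    return b1_hat, b2_hat


if __name__ == "__main__":
    for b1 in (1, -1):
        for b2 in (1, -1):
            print((b1, b2), detect_with_feedback(received(4, b1, 1, b2), a1=4))
```

In this toy case every symbol pair is recovered because the strong user's decision is reliable; when that first decision is wrong, the error propagates to the weaker users.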

% MATLAB PROGRAM <Ex11_4b.m>
% This program provides simulation for multiuser CDMA
% systems. The 4 users have different powers to illustrate the
% near-far effect in single user conventional receivers
%
% Decision feedback detectors are tested to show their
% ability to overcome the near-far problem.
%
%clear;clf
Ldata=100000;   % data length in simulation; Must be divisible by 8
Lc=31;          % spreading factor vs data rate
%User number = 4;
% Generate QPSK modulation symbols
data_sym=2*round(rand(Ldata,4))-1+j*(2*round(rand(Ldata,4))-1);
% Select 4 spreading codes (Gold Codes of Length 31)
gold31code;
pcode=GPN;
% Spreading codes are now in matrix pcode of 31x4
PowerMat=diag(sqrt([10 1 5 1]));
pcodew=pcode*PowerMat;
Rcor=pcodew'*pcodew;
% Now spread
x_in=kron(data_sym(:,1),pcodew(:,1))+kron(data_sym(:,2),pcodew(:,2))+...
    kron(data_sym(:,3),pcodew(:,3))+kron(data_sym(:,4),pcodew(:,4));
% Signal power of the channel input is 2*Lc
% Generate noise (AWGN)
noiseq=randn(Ldata*Lc,1)+j*randn(Ldata*Lc,1);   % Power is 2
BER_c2=[];
BER2=[];
BER_c4=[];
BER4=[];
BER_az=[];
for i=1:13,
    Eb2N(i)=(i-1);                % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Lc/(2*Eb2N_num);        % 1/SNR is the noise variance
    signois=sqrt(Var_n);          % standard deviation
    awgnois=signois*noiseq;       % AWGN
    % Add noise to signals at the channel output
    y_out=x_in+awgnois;
    Y_out=reshape(y_out,Lc,Ldata).'; clear y_out awgnois;
    % Despread first
    z_out=Y_out*pcode;            % despreader (conventional) output
    clear Y_out;
    % Decision based on the sign of the single receivers
    dec=sign(real(z_out))+j*sign(imag(z_out));
    % Decision feedback: detect users in order 1, 3, 2, 4 and cancel each
    dec1=sign(real(z_out(:,1)))+j*sign(imag(z_out(:,1)));
    z_fk1=z_out-dec1*Rcor(1,:);
    dec3=sign(real(z_fk1(:,3)))+j*sign(imag(z_fk1(:,3)));
    z_fk2=z_fk1-dec3*Rcor(3,:);
    dec2=sign(real(z_fk2(:,2)))+j*sign(imag(z_fk2(:,2)));
    z_fk3=z_fk2-dec2*Rcor(2,:);
    dec4=sign(real(z_fk3(:,4)))+j*sign(imag(z_fk3(:,4)));
    % Now compare against the original data to compute BER
    BER_c2=[BER_c2;sum([real(data_sym(:,2))~=real(dec(:,2));...
        imag(data_sym(:,2))~=imag(dec(:,2))])/(2*Ldata)];
    BER2=[BER2;sum([real(data_sym(:,2))~=real(dec2);...
        imag(data_sym(:,2))~=imag(dec2)])/(2*Ldata)];
    BER_c4=[BER_c4;sum([real(data_sym(:,4))~=real(dec(:,4));...
        imag(data_sym(:,4))~=imag(dec(:,4))])/(2*Ldata)];
    BER4=[BER4;sum([real(data_sym(:,4))~=real(dec4);...
        imag(data_sym(:,4))~=imag(dec4)])/(2*Ldata)];
    BER_az=[BER_az;0.5*erfc(sqrt(Eb2N_num))];   % analytical
end
clear z_fk1 z_fk2 z_fk3 dec1 dec3 dec2 dec4 x_in y_out noiseq;
figure(1)
figber=semilogy(Eb2N,BER_az,'k-',Eb2N,BER_c2,'k-o',Eb2N,BER_c4,'k-s',...
    Eb2N,BER2,'k--o',Eb2N,BER4,'k--s');
legend('Single-user (analysis)','User 2 (single user detector)',...
    'User 4 (single user detector)','User 2 (decision feedback)',...
    'User 4 (decision feedback)')
axis([0 12 0.99e-5 1.e0]);
set(figber,'LineWidth',2);
xlabel('E_b/N (dB)');ylabel('QPSK bit error rate')
title('Weak-user BER comparisons');

Figure 11.27 Performance of decision feedback MUD in comparison with the conventional single-user receiver.

PROBLEMS

11.1-1 Consider a fast hopping binary ASK system. The AWGN spectrum equals $S_n(f)$ = 10 -ti and the
binary signal amplitudes are 0 and 2 V, respectively. The ASK uses a data rate of 100 kbit/s and
is detected noncoherently. The ASK requires 100 kHz bandwidth for transmission. However, the
frequency hopping is over 12 equal ASK bands with bandwidth totaling 1.2 MHz. The partial
band jammer can generate a strong Gaussian noise-like interference with total power of 27 dBm.

(a) If a partial band jammer randomly jams one of the 12 FH channels, derive the BER of the
FH-ASK if the ASK signal hops 6 bands per bit period.

(b) If a partial band jammer randomly jams two of the 12 FH channels, derive the BER of the
FH-ASK if the ASK signal hops 6 bands per bit period.

(c) If a partial band jammer jams all 12 FH channels, derive the BER of the FH-ASK if the
ASK signal hops 6 bands per bit period.

11.1-2 Repeat Prob. 11.1-1 if the ASK signal hops 12 bands per bit period.

11.1-3 Repeat Prob. 11.1-1 if the ASK signal hops one band per bit period.

11.2-1 In a multiuser FHSS system that applies BFSK for each user transmission, consider each interfering
user as a partial band jammer. There are M users and L total signal bands for synchronous
frequency hopping. The desired user under consideration hops $L_h$ bands within each bit period.

(a) Find the probability that exactly 1 of the signal bands used by the desired user during a
signal bit is jammed by the interfering signals.

(b) Determine the probability that none of the signal bands used by the desired user during a
signal bit will be jammed by the interfering signals.

(c) Assume that when a partial signal band is jammed, we can compute the BER effect by
discarding the signal energy within the jammed band. Find the BER of a given user within
the system.

11.4-1 Let the AWGN noise $n(t)$ have spectrum $\mathcal{N}/2$. If the AWGN noise $n(t)$ is ideally band-limited
to $1/2T_c$ Hz, show that if the spreading signal $c(t)$ has autocorrelation function

$$R_c(\tau) = \sum_i \delta(\tau - i \cdot LT_c)$$

then the PSD of $x(t) = n(t)c(t)$ is approximately

$$S_x(f) = \int_{-\infty}^{\infty} S_n(\nu)\, S_c(f - \nu)\, d\nu = \frac{\mathcal{N}}{2}$$

11.5-1 Consider DSSS systems with interference signal $i(t)$. At the receiver, the despreading signal $c(t) = \pm 1$
with bandwidth $B_c$.

(a) Show that $i(t)$ and the despread interference

$$i_a(t) = i(t)c(t)$$

have identical power.

(b) If $i(t)$ has bandwidth $B_s$ and the spreading factor is $L$ such that $B_c = L \cdot B_s$, show that the
power spectrum of $i_a(t)$ is $L$ times lower but $L$ times wider.

11.6-1 In a multiuser CDMA system of DSSS, all transmitters are at equal distance from the receivers.
In other words, $g_i$ = constant. The additive white Gaussian noise spectrum equals $S_n(f)$ =
5.x 10 BPSK is the modulation format of all users at the rate of 16 kbit/s.

(a) If the spreading codes are all mutually orthogonal, find the desired user signal power $P_i$
required to achieve BER of $10^{-5}$.

(b) If the spreading codes are not orthogonal, more specifically,

$$R_{ij} = -1/16, \qquad i \neq j$$

Determine the required user signal power to achieve the same BER of $10^{-5}$ by applying Gaussian
approximation of the nonorthogonal MAI.

11.6-2 Repeat Prob. 11.6-1 if one of the 15 interfering transmitters is 2 times closer to the desired
receiver such that its gain is 4 times stronger.

11.7-1 For the multiuser CDMA system of Prob. 11.6-3, design the corresponding decorrelator and the
MMSE detectors.



12 DIGITAL COMMUNICATIONS
UNDER LINEARLY DISTORTIVE
CHANNELS


In our earlier discussion and analysis of digital communication systems, we have made
the rather idealistic assumption that the communication channel introduces no distortion.
Moreover, the only channel impairment under consideration has been additive white Gaussian
noise (AWGN). In reality, however, communication channels are far from ideal. Among
a number of physical channel distortions, multipath is arguably the most serious problem
encountered in wireless communications. In analog communication systems, multipath represents
an effect that can often be tolerated by human ears (as echoes) and eyes (as shadows). In
digital communications, however, multipath leads to linear channel distortions that manifest as
intersymbol interference (ISI). This is because multipath leads to multiple copies of the same
signal arriving at the receiver with different delays. Thus, each delayed copy of a symbol pulse
overlaps one or more adjacent symbols, causing ISI. As we have discussed, ISI can severely
affect the accuracy of the receivers. To combat the effects of ISI due to multipath channels,
we discuss, in this chapter, two highly effective tools: equalization and OFDM (orthogonal
frequency division multiplexing).


12.1 LINEAR DISTORTIONS OF WIRELESS 
MULTIPATH CHANNELS 

Digital communication requires that digital signals be transmitted over a specific medium 
between the transmitter and the receiver. The physical media (channels) in the real world are
analog. Because of practical limitations, however, analog channels are usually imperfect and 
can introduce unwanted distortions. Examples of nonideal analog media include telephone 
lines, coaxial cables, underwater acoustics, and radio-frequency (RF) wireless channels at 
various frequencies. Figure 12.1 demonstrates a simple case in which transmission from a 
base station to a mobile unit encounters a two-ray multipath channel: one ray from the line-of- 
sight and one from the ground reflection. At the receiver, there are two copies of the transmitted 
signal, one of which is a delayed version of the other. 


Figure 12.1 Simple illustration of a two-ray multipath channel.



To understand the effect of multipath in this example, we denote the line-of-sight signal
arrival and the reflective arrival, respectively, as

$$s(t) = m(t)\cos\omega_c t \qquad \text{and} \qquad \alpha_1 s(t - \tau_1) = \alpha_1 m(t - \tau_1)\cos\omega_c (t - \tau_1)$$

Here we assumed that the modulation is DSB with PAM message signal (Chapter 7)

$$m(t) = \sum_k a_k\, p(t - kT)$$

where $T$ is the PAM symbol duration. Note also that we use $\alpha_1$ and $\tau_1$, respectively, to represent
the multipath loss and the delay relative to the line-of-sight signal. Hence, the receiver RF input
signal is

$$r(t) = m(t)\cos\omega_c t + \alpha_1 m(t - \tau_1)\cos\omega_c (t - \tau_1) + n_c(t)\cos\omega_c t + n_s(t)\sin\omega_c t \tag{12.1}$$

In Eq. (12.1), $n_c(t)$ and $n_s(t)$ denote the in-phase and quadrature components of the bandpass
noise, respectively (Sec. 9.9). By applying coherent detection, the receiver baseband output
signal becomes

$$y(t) = \mathrm{LPF}\{2r(t)\cos\omega_c t\}$$

$$= m(t) + \alpha_1(\cos\omega_c\tau_1)\, m(t - \tau_1) + n_c(t) \tag{12.2a}$$

$$= \sum_k a_k\, p(t - kT) + (\alpha_1\cos\omega_c\tau_1)\sum_k a_k\, p(t - kT - \tau_1) + n_c(t)$$

$$= \sum_k a_k\left[p(t - kT) + (\alpha_1\cos\omega_c\tau_1)\, p(t - kT - \tau_1)\right] + n_c(t) \tag{12.2b}$$

By defining a baseband waveform

$$q(t) = p(t) + (\alpha_1\cos\omega_c\tau_1)\, p(t - \tau_1)$$

we can simplify Eq. (12.2b)

$$y(t) = \sum_k a_k\, q(t - kT) + n_c(t) \tag{12.2c}$$

Effectively, this multipath channel has converted the original pulse shape $p(t)$ into $q(t)$. If $p(t)$
was designed (as in Chapter 7) to satisfy Nyquist's first criterion of zero ISI,

$$p(nT) = \begin{cases} 1 & n = 0 \\ 0 & n = \pm 1, \pm 2, \ldots \end{cases}$$

then the new pulse shape $q(t)$ will certainly have ISI as

$$q(nT) = p(nT) + (\alpha_1\cos\omega_c\tau_1)\, p(nT - \tau_1) \neq 0 \qquad n = \pm 1, \pm 2, \ldots$$

To generalize, if there are $K + 1$ different paths, then the effective channel response is

$$q(t) = p(t) + \sum_{i=1}^{K}\left[\alpha_i\cos\omega_c\tau_i\right] p(t - \tau_i)$$

in which the line-of-sight path delay is assumed to be $\tau_0 = 0$ with unit path gain $\alpha_0 = 1$.
The ISI effect caused by the $K$ summations in $q(t)$ depends on (a) the relative strength of the
multipath gains $\{\alpha_i\}$; and (b) the multipath delays $\{\tau_i\}$.
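A small numerical check makes the loss of the zero-ISI property concrete for the two-ray case. The sketch assumes a sinc pulse as the Nyquist pulse p(t) and illustrative values for the path gain, the delay, and the carrier phase ω_c τ_1 (passed directly as `wc_tau1` in radians); none of these values come from the text.

```python
import math


def sinc_pulse(t, T=1.0):
    """Nyquist pulse p(t) = sin(pi t/T)/(pi t/T), equal to 1 at t=0 and 0 at other multiples of T."""
    x = t / T
    if abs(x) < 1e-12:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def q_pulse(t, alpha1, tau1, wc_tau1, T=1.0):
    """Two-ray effective pulse q(t) = p(t) + alpha1*cos(wc*tau1)*p(t - tau1)."""
    return sinc_pulse(t, T) + alpha1 * math.cos(wc_tau1) * sinc_pulse(t - tau1, T)


if __name__ == "__main__":
    for n in (-2, -1, 0, 1, 2):
        # p(nT) vanishes for n != 0, but q(nT) does not.
        print(n, sinc_pulse(float(n)), q_pulse(float(n), 0.5, 0.3, 0.0))
```

The samples of p(t) at nonzero multiples of T are zero (up to rounding), whereas the samples of q(t) are not, so each symbol leaks into the sampling instants of its neighbors.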

General QAM Models

For conserving bandwidth in both wire-line and wireless communications, QAM is an efficient
transmission. We again let the QAM symbol rate be $1/T$ and its symbol duration be $T$. Under
QAM, the data symbols $\{s_k\}$ are complex-valued, and the quadrature bandpass RF signal
transmission is

$$s(t) = \left[\sum_k \mathrm{Re}\{s_k\}\, p(t - kT)\right]\cos\omega_c t + \left[\sum_k \mathrm{Im}\{s_k\}\, p(t - kT)\right]\sin\omega_c t \tag{12.3}$$

Thus, under multipath channels with $K + 1$ paths and impulse response

$$\delta(t) + \sum_{i=1}^{K}\alpha_i\,\delta(t - \tau_i)$$

the received bandpass signal for QAM is

$$r(t) = s(t) + \sum_{i=1}^{K}\alpha_i\, s(t - \tau_i) + n_c(t)\cos\omega_c t + n_s(t)\sin\omega_c t \tag{12.4}$$

Applying coherent detection, the QAM demodulator has two baseband outputs
$\mathrm{LPF}\{2r(t)\cos\omega_c t\}$ and $\mathrm{LPF}\{2r(t)\sin\omega_c t\}$. These two (in-phase and quadrature) outputs are
real-valued and can be written as a single complex-valued output (with $\alpha_0 = 1$ and $\tau_0 = 0$ for the line-of-sight path):

$$y(t) = \mathrm{LPF}\{2r(t)\cos\omega_c t\} + j\cdot\mathrm{LPF}\{2r(t)\sin\omega_c t\} \tag{12.5a}$$

$$= \sum_k \mathrm{Re}\{s_k\}\left[\sum_{i=0}^{K}(\alpha_i\cos\omega_c\tau_i)\, p(t - kT - \tau_i)\right] + \sum_k \mathrm{Im}\{s_k\}\left[\sum_{i=0}^{K}(\alpha_i\sin\omega_c\tau_i)\, p(t - kT - \tau_i)\right]$$

$$\quad - j\cdot\sum_k \mathrm{Re}\{s_k\}\left[\sum_{i=0}^{K}(\alpha_i\sin\omega_c\tau_i)\, p(t - kT - \tau_i)\right] + j\cdot\sum_k \mathrm{Im}\{s_k\}\left[\sum_{i=0}^{K}(\alpha_i\cos\omega_c\tau_i)\, p(t - kT - \tau_i)\right] + n_c(t) + jn_s(t)$$

$$= \sum_k s_k\left[\sum_{i=0}^{K}\alpha_i\exp(-j\omega_c\tau_i)\, p(t - kT - \tau_i)\right] + n_c(t) + jn_s(t) \tag{12.5b}$$

Once again, we can define a baseband (complex) impulse response

$$q(t) = \sum_{i=0}^{K}\alpha_i\exp(-j\omega_c\tau_i)\, p(t - \tau_i) \tag{12.6a}$$

and the baseband complex noise

$$n_e(t) = n_c(t) + jn_s(t) \tag{12.6b}$$

The receiver demodulator output signal at the baseband can then be written simply as

$$y(t) = \sum_k s_k\, q(t - kT) + n_e(t) \tag{12.7}$$

in which all variables are complex-valued. Clearly, the original pulse $p(t)$ that was designed to
be free of ISI has been transformed by the multipath channel into $q(t)$. In the frequency
domain, we can see that

$$Q(f) = \sum_{i=0}^{K}\alpha_i\exp(-j\omega_c\tau_i)\exp(-j2\pi f\tau_i)\cdot P(f) \tag{12.8}$$

This means that the original frequency response $P(f)$ encounters a frequency-dependent
transfer function because of the multipath response

$$\sum_{i=0}^{K}\alpha_i\exp(-j\omega_c\tau_i)\exp(-j2\pi f\tau_i)$$

Therefore, the channel distortion is a function of the frequency $f$. Communication channels
that introduce frequency-dependent distortions are known as frequency-selective channels.
Frequency-selective channels can exhibit substantial ISI, which can lead to a significant increase
of detection errors.
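Frequency selectivity is easy to see by evaluating the multipath transfer function for a two-ray channel. The sketch below assumes the form sum_i α_i exp(-jω_c τ_i) exp(-j2πf τ_i) with the line-of-sight path at gain 1 and delay 0; the carrier frequency, the second path's gain, and its delay are illustrative values, and `multipath_response` is a hypothetical helper name.

```python
import cmath
import math


def multipath_response(f, gains, delays, fc):
    """Evaluate sum_i gains[i]*exp(-j*2*pi*fc*delays[i])*exp(-j*2*pi*f*delays[i])."""
    wc = 2.0 * math.pi * fc
    return sum(
        a * cmath.exp(-1j * wc * tau) * cmath.exp(-2j * math.pi * f * tau)
        for a, tau in zip(gains, delays)
    )


if __name__ == "__main__":
    gains = [1.0, 1.0]        # line-of-sight path and an equally strong echo (assumed)
    delays = [0.0, 1.0e-6]    # echo arrives 1 microsecond late (assumed)
    fc = 1.0e9                # 1 GHz carrier (assumed), so fc*tau is an integer
    for f in (0.0, 0.25e6, 0.5e6, 0.75e6, 1.0e6):
        print(f, abs(multipath_response(f, gains, delays, fc)))
```

For these values the magnitude swings from about 2 at f = 0 to a deep null at f = 0.5 MHz, so different parts of the signal band see very different channel gains.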

Wire-Line ISI 

Although we have just demonstrated how multipath in wireless communications can lead to 
ISI and linear channel distortions, wire-line systems are not entirely immune to such problems. 
Indeed, wire-line systems do not have a multipath environment because all signals are
transmitted by dedicated cables. However, when the cables have multiple unused open terminals,
impedance mismatch at these open terminals can also generate reflected signals that will arrive
as delayed copies at the receiver terminals. Therefore, ISI due to linear channel distortion can
also be a problem in wire-line systems. Cable internet service is one example.

Equalization and OFDM 

Because ISI channels lead to serious signal degradation and poor detection performance, 
their effects must be compensated either at the transmitter or at the receiver. In most cases, 
transmitters in an uncertain environment are not aware of the actual conditions of propagation. 
Thus, it is up to the receivers to identify the unknown multipath channel q(t) and to find 
effective means to combat the ISI. The two most common and effective tools against ISI
channels are channel equalization and OFDM.

Figure 12.2 Baseband representation of QAM transmission over a linear time-invariant channel with ISI. (Block diagram: input, LTI channel $q(t)$, additive noise $n_e(t)$, output $y(t)$.)

12.2 RECEIVER CHANNEL EQUALIZATION

It is convenient for us to describe the problem of channel equalization in the stationary channel
case. Once the fundamentals of linear time-invariant (LTI) channel equalization are understood,
adaptive technology can handle time-varying channels.

When the channel is LTI, we use the simple system diagram of Fig. 12.2 to describe the
problem of channel equalization. In general, channel equalization is studied for the (spectrally
efficient) digital QAM systems. The baseband model for a typical QAM (quadrature amplitude
modulated) data communication system consists of an unknown LTI channel $q(t)$, which
represents the physical interconnection between the transmitter and the receiver in baseband.

The baseband transmitter generates a sequence of complex-valued random input data $\{s_k\}$,
each element of which belongs to the constellation $\mathcal{A}$ of QAM symbols. The data sequence
$\{s_k\}$ is sent through the baseband channel that is LTI with impulse response $q(t)$. Because
QAM symbols $\{s_k\}$ are complex-valued, the baseband channel impulse response $q(t)$ is also
complex-valued in general.

Under the causal and complex-valued LTI communication channel with impulse response
$q(t)$, the input-output relationship of the QAM system can be written as

$$ y(t) = \sum_{k=-\infty}^{\infty} s_k\,q(t-kT+t_0) + n_e(t), \qquad s_k\in\mathcal{A} \tag{12.9} $$

Typically the baseband channel noise $n_e(t)$ is assumed to be stationary, Gaussian, and
independent of the channel input $s_k$. Given the received baseband signal $y(t)$ at the receiver, the
job of the channel equalizer is to estimate the original data $\{s_k\}$ from the received signal $y(t)$.

In what follows, we present the common framework within which channel equalization
is typically accomplished. Without loss of generality, we let $t_0 = 0$.


12.2.1 Antialiasing Filter vs. Matched Filter 

We showed in Secs. 10.1 and 10.6 that the optimum receiver filter should be matched to the total
response $q(t)$. This filter serves to maximize the SNR of the sampled signal at the filter output.
Even if the response $q(t)$ has ISI, Forney¹ has established the optimality* of the matched filter
receiver, as shown in Fig. 12.3. With a matched filter $q(-t)$ and symbol (baud) rate sampling
at $t = nT$, the receiver obtains an output sequence relationship between the transmitter data
$\{s_k\}$ and the receiver samples as

$$ z[n] = \sum_k s_k\,h(nT-kT) \tag{12.10} $$

* Forney proved¹ that sufficient statistics for input symbol estimation are retained by baud rate sampling at $t = nT$ of
the matched filter output signal. This result forms the basis of the well-known single-input-single-output (SISO) system
model obtained by matched filter sampling. However, when $q(t)$ is unknown, the optimality no longer applies.

Figure 12.3 Optimal matched filter receiver.

where

$$ h(t) = q(t) * q(-t) \tag{12.11} $$


If we denote the samples of $h(t)$ by

$$ h[n] = h(nT) $$

then Eq. (12.10) can be simplified to

$$ z[n] = \sum_k s_k\,h[n-k] = h[n] * s_n \tag{12.12} $$

In short, the channel (input-output) signals are related by a single-input-single-output (SISO)
linear discrete channel with transfer function

$$ H(z) = \sum_n h[n]\,z^{-n} \tag{12.13} $$

The SISO discrete representation of the linear QAM signal leads to the standard T-spaced
equalizer (TSE). The term T-spaced equalization refers to processing of the received signal
sampled at the rate of $1/T$. Therefore, the time separation between successive samples equals
the baud (symbol) period $T$.

The optimal matched filter receiver faces a major practical obstacle: the total pulse
shape response $q(t)$ depends on the multipath channel environment. In reality, it is
difficult to adjust the receiver filter according to the time-varying $q(t)$ because the channel
environment may undergo significant and possibly rapid changes. Moreover, the receivers generally
do not have a priori information on the channel that affects $q(t)$. As a result, it does not make
sense to implement the optimum receiver filter $q(-t)$ in a dynamic channel environment. It
makes better sense to design and implement a time-invariant receiver filter. Therefore, the
important task is to select a receiver filter without losing any signal information in $y(t)$.

To find a solution, recall the QAM channel input signal

$$ x(t) = \sum_k s_k\,p(t-kT) $$

We have learned from Section 7.2 [see Eq. (7.9)] that the power spectral density of an
amplitude-modulated pulse train is

$$ S_x(f) = |P(f)|^2\,\frac{1}{T}\sum_{n=-\infty}^{\infty} R_s[n]\,e^{-j2\pi nfT} \tag{12.14a} $$

$$ R_s[n] = \overline{s_{k+n}\,s_k^*} \tag{12.14b} $$

by simply substituting the pulse amplitude $a_k$ with the QAM symbol $s_k$. The signal spectrum
of Eq. (12.14a) shows that the signal component in $y(t)$ is limited by the bandwidth of $p(t)$
or $P(f)$.

Figure 12.4

Therefore, the receiver filter must not filter out any valuable signal component and should
have bandwidth equal to the bandwidth of $P(f)$. On the other hand, if we let the receiver
filter have a bandwidth larger than that of $P(f)$, then more noise will pass through the filter, with no
benefit to the signal. For these reasons, a good receiver filter should have bandwidth exactly
identical to the bandwidth of $P(f)$. Of course, many such filters exist. One is the filter matched
to the transmission pulse $p(t)$ given by

$$ p(-t) \iff P^*(f) $$

Another consideration is that, if the channel introduces no additional distortions, then $q(t) = p(t)$.
In this case, the optimum receiver would be the filter $p(-t)$ matched to $p(t)$. Consequently,
it makes sense to select $p(-t)$ as a standard receiver filter (Fig. 12.4) for two reasons:

(a) The filter $p(-t)$ retains all the signal spectral components in the received signal $y(t)$.

(b) The filter $p(-t)$ is optimum if the environment happens to exhibit no channel distortions.

Therefore, we often apply the receiver filter $p(-t)$ matched to the transmission pulse shape
$p(t)$. This means that the total channel impulse response consists of

$$ h(t) = q(t) * p(-t) $$

Notice that because of the filtering, $z(t) = p(-t) * y(t)$. The signal $z(t)$ now becomes

$$ z(t) = \sum_k s_k\,h(t-kT) + w(t) \tag{12.15} $$

in which the filtered noise term $w(t)$ arises from

$$ w(t) = p(-t) * n_e(t) \tag{12.16} $$

with power spectral density

$$ S_w(f) = |P(f)|^2\,S_{n_e}(f) $$

Finally, the relationship between the sampled output $z[n]$ and the communication symbols $s_k$ is

\begin{align}
z[n] &= \sum_k h[n-k]\,s_k + w[n] \nonumber\\
&= \sum_k h[k]\,s_{n-k} + w[n] \tag{12.17}
\end{align}

where the discrete noise samples are denoted by $w[n] = w(nT)$.

Generally, there are two approaches to the problem of channel input recovery (i.e.,
equalization) under ISI channels. The first approach is to determine the optimum receiver based
on channel and noise models. This approach leads to maximum likelihood sequence estimation
(MLSE), which is computationally demanding. A low-cost alternative is to design filters
known as channel equalizers to compensate for the channel distortion. In what follows, we
first describe the essence of the MLSE method for symbol recovery. By illustrating its typically
high computational complexity, we provide the necessary motivation for the subsequent
discussions on various lower-complexity channel equalizers.
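The sampled model of Eq. (12.17) can be sketched in a few lines. The function name `isi_channel`, the per-component noise standard deviation, and the example taps are illustrative assumptions, and symbols before $n = 0$ are taken to be zero.

```python
import random


def isi_channel(symbols, h, noise_std=0.0, seed=0):
    """Return z[n] = sum_k h[k] * s[n-k] + w[n], with s[n] = 0 for n < 0.

    noise_std is the (assumed) standard deviation of the real and of the
    imaginary part of each noise sample w[n]; 0.0 gives the noise-free output.
    """
    rng = random.Random(seed)
    z = []
    for n in range(len(symbols)):
        acc = sum(h[k] * symbols[n - k] for k in range(len(h)) if n - k >= 0)
        if noise_std > 0.0:
            acc += complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        z.append(acc)
    return z


if __name__ == "__main__":
    # Hypothetical BPSK symbols through a two-tap channel h = [1, 0.5].
    print(isi_channel([1, -1, 1], [1.0, 0.5]))
```

Each output sample mixes the current symbol with earlier ones, which is exactly the ISI that the equalizers discussed in this chapter try to undo.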


12.2.2 Maximum Likelihood Sequence Estimation (MLSE) 

The receiver output samples $\{z[n]\}$ depend on the unknown input QAM symbols $\{s_n\}$ according
to the relationship of Eq. (12.17). The optimum (MAP) detection of $\{s_n\}$ from $\{z[n]\}$ requires
the maximization of the joint conditional probability [Eq. (10.81)]:

$$ \max_{\{s_n\}}\; p\left(\ldots, s_{n-1}, s_n, s_{n+1}, \ldots \mid \ldots, z[n-1], z[n], z[n+1], \ldots\right) \tag{12.18} $$

Unlike the optimum symbol-by-symbol detection for AWGN channels derived and analyzed
in Sec. 10.6, the interdependent relationship in Eq. (12.17) means that the optimum receiver
must detect the entire sequence $\{s_n\}$ from a sequence of received signal samples $\{z[n]\}$.

To simplify this optimum receiver, we first note that in most communication systems and
applications, each QAM symbol $s_n$ is randomly selected from its constellation $\mathcal{A}$ with equal
probability. Thus, the MAP detector can be translated into a maximum likelihood sequence
estimation (MLSE):

$$ \max_{\{s_n\}}\; p\left(\ldots, z[n-1], z[n], z[n+1], \ldots \mid \ldots, s_{n-1}, s_n, s_{n+1}, \ldots\right) \tag{12.19} $$


If the original channel noise $n_e(t)$ is white Gaussian, then the discrete noise $w[n]$ is also
Gaussian because Eq. (12.16) shows that $w(t)$ is the filtered output of $n_e(t)$. In fact, we can define
the power spectral density of the white noise $n_e(t)$ as

$$ S_{n_e}(f) = \frac{\mathcal{N}}{2} $$

Then the power spectral density of the filtered noise $w(t)$ is

$$ S_w(f) = |P(f)|^2\,S_{n_e}(f) = \frac{\mathcal{N}}{2}\,|P(f)|^2 \tag{12.20} $$

From this information, we can observe that the autocorrelation function between the noise
samples is

\begin{align}
R_w[\ell] &= \overline{w[\ell+n]\,w^*[n]} \nonumber\\
&= \overline{w(\ell T+nT)\,w^*(nT)} \nonumber\\
&= \int_{-\infty}^{\infty} S_w(f)\,e^{-j2\pi f\ell T}\,df \nonumber\\
&= \frac{\mathcal{N}}{2}\int_{-\infty}^{\infty} |P(f)|^2\,e^{-j2\pi f\ell T}\,df \tag{12.21}
\end{align}

In general, the autocorrelation between two noise samples in Eq. (12.21) depends on the
receiver filter, which is, in this case, $p(-t)$. In Sec. 7.3, the ISI-free pulse design based on
Nyquist's first criterion is of particular interest. Nyquist's first criterion requires that the total
response from the transmitter to the receiver be free of intersymbol interference. Without
channel distortion, the QAM system in our current study has a total impulse response of

$$ p(t) * p(-t) \iff |P(f)|^2 $$

For this combined pulse shape to be free of ISI, we can apply the first Nyquist criterion in the
frequency domain

$$ \frac{1}{T}\sum_{k=-\infty}^{\infty}\left|P\!\left(f+\frac{k}{T}\right)\right|^2 = 1 \tag{12.22a} $$

This is equivalent to the time domain requirement

$$ p(t) * p(-t)\,\Big|_{t=\ell T} = \begin{cases} 1 & \ell = 0 \\ 0 & \ell = \pm 1, \pm 2, \ldots \end{cases} \tag{12.22b} $$


In other words, the Nyquist pulse-shaping filter is equally split between the transmitter and
the receiver. According to Eq. (12.22a), the pulse-shaping frequency response $P(f)$ is the
square root of a pulse shape that satisfies Nyquist's first criterion in the frequency domain.
If the raised-cosine pulse shape of Section 7.3 is adopted, then $P(f)$ would be known as the
root-raised-cosine pulse. For a given roll-off factor $r$, the root-raised-cosine pulse in the time
domain is

$$ p_{\mathrm{rrc}}(t) = \frac{4r}{\pi\sqrt{T}}\cdot\frac{\cos\left[(1+r)\pi t/T\right] + (4rt/T)^{-1}\sin\left[(1-r)\pi t/T\right]}{1-(4rt/T)^2} \tag{12.23} $$


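Based on the ISI-free conditions of Eq. (12.22b), we can derive from Eq. (12.21) that

\begin{align}
R_w[\ell] &= \frac{\mathcal{N}}{2}\int_{-\infty}^{\infty} |P(f)|^2\,e^{-j2\pi f\ell T}\,df \nonumber\\
&= \frac{\mathcal{N}}{2}\; p(t) * p(-t)\,\Big|_{t=\ell T} \nonumber\\
&= \begin{cases} \dfrac{\mathcal{N}}{2} & \ell = 0 \\ 0 & \ell = \pm 1, \pm 2, \ldots \end{cases} \tag{12.24}
\end{align}

This means that the noise samples $\{w[n]\}$ are uncorrelated. Since the noise samples $\{w[n]\}$ are
Gaussian, they are also independent. As a result, the conditional joint probability of Eq. (12.19)
becomes much simpler:

$$ p\left(\ldots, z[n-1], z[n], z[n+1], \ldots \mid \ldots, s_{n-1}, s_n, s_{n+1}, \ldots\right) = \prod_i p\left(z[n-i] \mid \ldots, s_{n-1}, s_n, s_{n+1}, \ldots\right) \tag{12.25} $$

Indeed, Eq. (12.24) tells us that $z[n-i]$ is Gaussian with equal variance $\mathcal{N}/2$ and mean value of

$$ \sum_k h[k]\,s_{n-i-k} $$

Therefore, the MLSE optimum receiver under Gaussian channel noise and root-raised-cosine
pulse shape $p_{\mathrm{rrc}}(t)$ [Eq. (12.23)] is

$$ \max_{\{s_n\}}\; \ln\left[\prod_i p\left(z[n-i] \mid \ldots, s_{n-1}, s_n, s_{n+1}, \ldots\right)\right] \iff \max_{\{s_n\}}\left\{-\frac{1}{\mathcal{N}}\sum_i\left|z[n-i]-\sum_k h[k]\,s_{n-i-k}\right|^2\right\} \tag{12.26a} $$

Thus, MLSE is equivalent to

$$ \min_{\{s_n\}}\; \sum_i\left|z[n-i]-\sum_k h[k]\,s_{n-i-k}\right|^2 \tag{12.26b} $$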


For a vast majority of communication channels, the impulse response $h[k]$ can be closely
approximated as a finite impulse response (FIR) filter of some finite order. If the maximum
channel order is $L$ such that

$$ H(z) = \sum_{k=0}^{L} h[k]\,z^{-k} $$

then the MLSE receiver needs to solve

$$ \min_{\{s_n\}}\; \sum_i\left|z[n-i]-\sum_{k=0}^{L} h[k]\,s_{n-i-k}\right|^2 \tag{12.27} $$

We note that the MLSE algorithm requires that the receiver possess the knowledge of the
discrete channel coefficients $\{h[k]\}$. When exact channel knowledge is not available, the receiver
must first complete the important task of channel estimation.


MLSE Complexity and Practical Implementations 

Despite the apparent high complexity of the MLSE algorithm [Eq. (12.27)], there exists a
much more efficient solution given by Viterbi² based on the dynamic programming principle of
Bellman.³ This algorithm, often known as the Viterbi algorithm, does not have an exponentially
growing complexity as the data length grows. Instead, if the QAM constellation size is $M$,
then the complexity of the Viterbi algorithm grows according to $M^L$. The Viterbi algorithm is a
very powerful tool, particularly when the channel order $L$ is not very long and the constellation
size $M$ is not huge. The details of the Viterbi algorithm will be explained in Chapter 14 when
we present the decoding of convolutional codes.

MLSE is very common in practical applications. Most notably, many GSM cellular
receivers perform the MLSE detection described here against multipath distortions. Because
GSM uses binary constellations in voice transmission, the complexity of the MLSE receivers
is reasonably low for common cellular channels that can be approximated as FIR responses of
order 3 to 8.

Figure 12.5 A SISO discrete linear channel model for TSE.



On the other hand, the modulation formats adopted in high-speed dial-up modems are
highly complex. For example, the V.32bis (14.4 kbit/s) modem uses a trellis-coded QAM
constellation of size 128 (with 64 distinct symbols) at the symbol rate of 2400 baud (symbols/s).
In such applications, even a relatively short $L = 5$ FIR channel would require MLSE to have
over 1 billion states. In fact, at higher bit rates, dial-up modems can use size 256 QAM or even
size 960 QAM. As a result, the large number of states in MLSE makes it completely unsuitable
as a receiver in such systems. Consequently, suboptimal equalization approaches with
low complexity are much more attractive. The design of simple and cost-effective equalizers
(deployed in applications including voiceband dial-up modems) is discussed next.
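To make the complexity argument concrete, here is a minimal sketch of MLSE for the cost of Eq. (12.27), once by exhaustive search and once with a Viterbi-style dynamic program whose state is the last $L$ symbols. It is an illustration under stated assumptions rather than the book's implementation: symbols before $n = 0$ are taken as zero, the channel taps are assumed known, and all function names are hypothetical. The helper `trellis_states` simply evaluates $M^L$; reading the "over 1 billion states" figure as $64^5$ is an interpretation, not something the text states explicitly.

```python
from itertools import product


def sequence_cost(z, h, s):
    """Cost of Eq. (12.27): sum_n |z[n] - sum_k h[k] s[n-k]|^2, with s[n] = 0 for n < 0."""
    total = 0.0
    for n, zn in enumerate(z):
        pred = sum(h[k] * s[n - k] for k in range(len(h)) if n - k >= 0)
        total += abs(zn - pred) ** 2
    return total


def brute_force_mlse(z, h, alphabet):
    """Exhaustive search over all M**len(z) candidate sequences."""
    return list(min(product(alphabet, repeat=len(z)),
                    key=lambda s: sequence_cost(z, h, s)))


def viterbi_mlse(z, h, alphabet):
    """Same minimisation, keeping one survivor per state (the last L symbols)."""
    L = len(h) - 1
    survivors = {(0.0,) * L: (0.0, [])}  # state -> (metric, decided symbols)
    for zn in z:
        nxt = {}
        for state, (metric, path) in survivors.items():
            for s in alphabet:
                pred = h[0] * s + sum(hk * sk for hk, sk in zip(h[1:], state))
                m = metric + abs(zn - pred) ** 2
                new_state = ((s,) + state)[:L]
                if new_state not in nxt or m < nxt[new_state][0]:
                    nxt[new_state] = (m, path + [s])
        survivors = nxt
    return min(survivors.values(), key=lambda v: v[0])[1]


def trellis_states(M, L):
    """Number of trellis states for constellation size M and channel order L."""
    return M ** L


if __name__ == "__main__":
    h = [1.0, 0.5]                           # hypothetical known channel
    z = [0.9, -0.2, -1.7, 0.1, 1.2, -0.9]    # hypothetical noisy samples
    print(viterbi_mlse(z, h, (-1.0, 1.0)))
    print(trellis_states(2, 5), trellis_states(64, 5))
```

The exhaustive search grows with the data length, while the dynamic program keeps at most $M^L$ survivors per step, which is manageable for binary symbols but not for a 64-symbol alphabet with $L = 5$.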

12.3 LINEAR T-SPACED EQUALIZATION (TSE) 

When the receiver filter is matched to the transmission pulse $p(t)$ only, it is no longer optimum.*
Even if the ideal matched filter $q(-t)$ is known and applied, it is quite possible in practice for
the sampling instant to have an offset $t_0$ such that the sampling takes place at $t = nT + t_0$.
Such a sampling offset is known as a timing error. When there is a timing error, the receiver
is also not optimum. It is in fact commonplace for practical communication systems to have
unknown distortive channels and timing jitters. Nevertheless, T-spaced equalization is simpler
to implement. Here we discuss the fundamental aspects of TSE design.

Because T-spaced sampling leads to a simple discrete time linear system [Eq. (12.17)], as
shown in Fig. 12.5, the basic linear equalizer is simply a linear filter $F(z)$ followed by a direct
QAM decision device. The operational objective of the equalizer (filter) $F(z)$ is to remove as
much ISI as possible from its output $d[n]$. We begin our discussion on the T-spaced equalizer
(TSE) by denoting the (causal) equalizer transfer function

$$ F(z) = \sum_{i=0}^{\infty} f[i]\,z^{-i} $$

If the channel noise $w[n]$ is included, the TSE output is

$$ d[n] = F(z)\,z[n] = \underbrace{F(z)H(z)\,s_n}_{\text{signal term}} + \underbrace{F(z)\,w[n]}_{\text{noise term}} \tag{12.28} $$

We denote the joint channel equalizer transfer function as

$$ C(z) = F(z)H(z) = \sum_{i=0}^{\infty} c_i\,z^{-i} $$

The goal of the equalizer $F(z)$ is to clean up the ISI in $d[n]$ to achieve an error-free decision

$$ \hat{s}_{n-u} = \mathrm{dec}\left(d[n]\right) = s_{n-u} \tag{12.29} $$

where $u$ is a fixed delay in the equalizer output. Because both the channel and the equalizer
must be causal, the inclusion of a possible delay $u$ provides opportunities for simpler and better
equalizer designs.

* The sufficient statistics shown by G. D. Forney¹ are not necessarily retained.

To better understand the design of the TSE filter $F(z)$, we can divide the TSE output into
different terms

\begin{align}
d[n] &= \sum_{i=0}^{\infty} c_i\,s_{n-i} + \sum_{i=0}^{\infty} f[i]\,w[n-i] \nonumber\\
&= c_u\,s_{n-u} + \underbrace{\sum_{i=0,\,i\neq u}^{\infty} c_i\,s_{n-i}}_{\text{ISI term}} + \underbrace{\sum_{i=0}^{\infty} f[i]\,w[n-i]}_{\text{noise term}} \tag{12.30}
\end{align}

The equalizer filter output $d[n]$ consists of the desired signal component with the right delay,
plus the ISI and noise terms. If both the ISI and noise terms are zero, then the QAM decision
device will always make correct detections without any error. Therefore, the design of this
linear equalizer filter $F(z)$ should aim to minimize the effect of the ISI and the noise terms. In
practice, there are two very popular types of linear equalizer: zero-forcing (ZF) design and
minimum mean square error (MMSE) design.

12.3.1 Zero-Forcing TSE 

The principle of zero-forcing equalizer design is to eliminate the ISI term without considering
the noise effect. In principle, a perfect ZF equalizer $F(z)$ should force

$$ \sum_{i=0,\,i\neq u}^{\infty} c_i\,s_{n-i} = 0 $$

In other words, all ISI terms are eliminated:

$$ c_i = \begin{cases} 1 & i = u \\ 0 & i \neq u \end{cases} \tag{12.31a} $$

Equivalently in the frequency domain, the ZF equalizer requires

$$ C(z) = F(z)H(z) = z^{-u} \tag{12.31b} $$

Notice that the linear equalizer $F(z)$ is basically an inverse filter of the discrete ISI channel
$H(z)$ with appropriate delay $u$:

$$ F(z) = \frac{z^{-u}}{H(z)} \tag{12.31c} $$

If the ZF filter of Eq. (12.31c) is causal and can be implemented, then the ISI is completely
eliminated from the equalizer output $d[n]$. This appears to be an excellent solution, since the only
decision that the decision device now must make is based on

$$ d[n] = s_{n-u} + F(z)\,w[n] $$

without any ISI. One major drawback of the ZF equalizer lies in the remaining noise term
$F(z)w[n]$. If the noise power in $d[n]$ is weak, then the QAM decision would be highly accurate.
Problems arise when the transfer function $F(z)$ has strong gains at certain frequencies. As a
result, the noise term $F(z)w[n]$ may be amplified at those frequencies. In fact, when the
frequency response of $H(z)$ has spectral nulls, that is,

$$ H(e^{j\omega_0}) = 0 \quad \text{for some } \omega_0 \in [0, \pi] $$

then the ZF equalizer $F(z)$ at $\omega_0$ would have infinite gain, and substantially amplify the noise
component at $\omega_0$.

A different perspective is to consider the filtered noise variance. If $w[n]$ are independent
identically distributed (i.i.d.) Gaussian with zero mean and variance $\mathcal{N}/2$, then the filtered
noise term equals

$$ \tilde{w}[n] = F(z)\,w[n] = \sum_{i=0}^{\infty} f[i]\,w[n-i] $$

The noise term $\tilde{w}[n]$ remains Gaussian with mean

$$ \overline{\sum_{i=0}^{\infty} f[i]\,w[n-i]} = \sum_{i=0}^{\infty} f[i]\,\overline{w[n-i]} = 0 $$

and variance

$$ \overline{\left|\sum_{i=0}^{\infty} f[i]\,w[n-i]\right|^2} = \frac{\mathcal{N}}{2}\sum_{i=0}^{\infty} |f[i]|^2 $$

Because the ZF equalizer output is

$$ d[n] = s_{n-u} + \tilde{w}[n] $$

the probability of decision error in $\mathrm{dec}(d[n])$ can therefore be analyzed by applying the same
tools used in Chapter 10 (Sec. 10.6). In particular, under BPSK modulation, $s_n = \pm\sqrt{E_b}$ with
equal probability. Then the probability of detection error is

$$ P_b = Q\left(\sqrt{\frac{2E_b}{\mathcal{N}\sum_{i=0}^{\infty}|f[i]|^2}}\right) \tag{12.32} $$

where the ZF equalizer parameters can be obtained via the inverse Z-transform

$$ f[i] = \frac{1}{2\pi j}\oint F(z)\,z^{i-1}\,dz = \frac{1}{2\pi j}\oint \frac{z^{i-1-u}}{H(z)}\,dz \tag{12.33} $$


If $H(e^{j\omega})$ has spectral nulls, then $f[i]$ from Eq. (12.33) may become very large, causing a
serious increase of $P_b$.

Example 12.1 Consider a second-order channel

$$ H(z) = 1 + z^{-2} $$

Determine the noise amplification effect on the ZF equalizer for a BPSK transmission.

Because $H(e^{j2\pi f}) = 0$ when $f = \pm 1/4$, it is clear that $H(z)$ has spectral nulls. By applying
the ZF equalizer, we have

$$ f[i] = \frac{1}{2\pi j}\oint \frac{z^{i-1-u}}{1+z^{-2}}\,dz = \begin{cases} (-1)^{(i-u)/2} & i \ge u,\ i-u \text{ even} \\ 0 & \text{otherwise} \end{cases} $$

Therefore,

$$ \sum_{i=0}^{\infty} |f[i]|^2 = \infty $$

This means that the BER of the BPSK transmission equals

$$ P_b = Q(0) = 0.5 $$

The noise amplification is so severe that the detection is completely random.
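The divergence in this example can be checked numerically. The sketch below expands $1/H(z)$ by polynomial long division, truncates it to a finite number of taps, and evaluates Eq. (12.32) with the truncated sum. The truncation, the function names, and the values $E_b = 1$ and $\mathcal{N} = 0.1$ are assumptions for illustration only; a truncated inverse is no longer an exact ZF equalizer, so this shows only how the noise gain $\sum|f[i]|^2$ grows with the number of taps kept.

```python
import math


def inverse_taps(h, n_taps):
    """First n_taps coefficients of 1/H(z) by long division (requires h[0] != 0)."""
    f = []
    for i in range(n_taps):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(i, len(h) - 1) + 1):
            acc -= h[k] * f[i - k]
        f.append(acc / h[0])
    return f


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def zf_bpsk_ber(eb, n0, taps):
    """Eq. (12.32) evaluated with a finite set of equalizer taps."""
    noise_gain = sum(abs(t) ** 2 for t in taps)
    return q_function(math.sqrt(2.0 * eb / (n0 * noise_gain)))


if __name__ == "__main__":
    channel = [1.0, 0.0, 1.0]  # H(z) = 1 + z^-2, as in Example 12.1
    for n_taps in (2, 20, 200, 2000):
        taps = inverse_taps(channel, n_taps)
        print(n_taps, sum(t * t for t in taps), zf_bpsk_ber(1.0, 0.1, taps))
```

Every second tap of the expansion has unit magnitude, so the noise gain grows linearly with the number of taps and the predicted error probability drifts toward $Q(0) = 0.5$.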


Example 12.1 clearly shows the significant impact of noise amplification due to ZF
equalization. The noise amplification effect strongly motivates other design methodologies
for equalizers. One practical solution is the minimum mean square error (MMSE) design.

12.3.2 TSE Design Based on MMSE 

Because of the noise amplification effect in ZF equalization, we must not try to eliminate the
ISI without considering the negative impact from the noise term. In fact, we can observe the
equalizer output in Eq. (12.30) and quantify the overall distortion in $d[n]$ by considering the
difference (or error)

$$ d[n] - s_{n-u} = \sum_{i=0}^{\infty} c_i\,s_{n-i} - s_{n-u} + \sum_{i=0}^{\infty} f[i]\,w[n-i] \tag{12.34} $$

To reduce the number of decision errors when

$$ \mathrm{dec}\left(d[n]\right) \neq s_{n-u} $$

it would be sensible to design an equalizer that would minimize the mean square error between
$d[n]$ and $s_{n-u}$. In other words, the MMSE equalizer design should minimize

$$ \overline{\left|d[n]-s_{n-u}\right|^2} \tag{12.35} $$

Let us now proceed to find an equalizer filter that can minimize the mean square error of
Eq. (12.35). Once again, we will apply the principle of orthogonality in optimum estimation
(Sec. 8.5), that the error (difference) signal must be orthogonal to the signals used in the filter
input. Because $d[n] = \sum_{i=0}^{\infty} f[i]\,z[n-i]$, we must have

$$ \left(d[n]-s_{n-u}\right) \perp z[n-\ell] \qquad \ell = 0, 1, \ldots $$

In other words,

$$ \overline{\left(d[n]-s_{n-u}\right) z^*[n-\ell]} = 0 \qquad \ell = 0, 1, \ldots \tag{12.36} $$

Therefore, the equalizer parameters $\{f[i]\}$ must satisfy

$$ \overline{\left(\sum_{i=0}^{\infty} f[i]\,z[n-i] - s_{n-u}\right) z^*[n-\ell]} = 0 $$

Note that the signal $s_n$ and the noise $w[n]$ are independent. Moreover, $\{s_n\}$ are also i.i.d. with
zero mean and variance $E_s$. Therefore, $\overline{s_{n-u}\,w[n-\ell]^*} = 0$, and we have

\begin{align}
\overline{s_{n-u}\,z^*[n-\ell]} &= \overline{s_{n-u}\left(\sum_{j=0}^{\infty} h[j]^*\,s^*_{n-j-\ell} + w[n-\ell]^*\right)} \nonumber\\
&= \sum_{j=0}^{\infty} h[j]^*\;\overline{s_{n-u}\,s^*_{n-j-\ell}} + 0 \nonumber\\
&= \begin{cases} E_s\,h[u-\ell]^* & 0 \le \ell \le u \\ 0 & \ell > u \end{cases} \tag{12.37}
\end{align}

Let us also denote

$$ R_z[m] = \overline{z[n+m]\,z^*[n]} \tag{12.38} $$

Then the MMSE equalizer is the solution to the linear equations

$$ \sum_{i=0}^{\infty} f[i]\,R_z[\ell-i] = \begin{cases} E_s\,h[u-\ell]^* & \ell = 0, 1, \ldots, u \\ 0 & \ell = u+1, u+2, \ldots, \infty \end{cases} \tag{12.39} $$

Based on the channel output signal model, we can show that

\begin{align}
R_z[m] &= \overline{\left(\sum_{i=0}^{\infty} h_i\,s_{n+m-i} + w[n+m]\right)\left(\sum_{j=0}^{\infty} h_j\,s_{n-j} + w[n]\right)^*} \nonumber\\
&= E_s\sum_{j=0}^{\infty} h_{m+j}\,h_j^* + \frac{\mathcal{N}}{2}\,\delta[m] \tag{12.40}
\end{align}

Minimum MSE and Optimum Delay

Because of the orthogonality condition Eq. (12.36), we have

$$ \overline{\left(d[n]-s_{n-u}\right) d[n]^*} = 0 $$

Hence, the resulting minimum mean square error is shown to be

\begin{align}
\mathrm{MSE}(u) &= \overline{\left(s_{n-u}-d[n]\right) s^*_{n-u}} \nonumber\\
&= E_s\left(1-c_u\right) \nonumber\\
&= E_s\left(1-\sum_{i=0}^{\infty} h_i\,f[u-i]\right) \tag{12.41}
\end{align}

It is clear that MMSE equalizers of different delays can lead to different mean square error
results. To find the delay that achieves the least mean square error, the receiver can determine
the optimum delay according to

$$ u_o = \arg\max_u \sum_{i=0}^{\infty} h_i\,f[u-i] \tag{12.42} $$
Finite Length MMSE Equalizers 

Because we require the equalizer $F(z)$ to be causal, the MMSE equalizer based on the solution
of Eq. (12.39) does not have a simple closed form. The reason is that $\{f[i]\}$ is causal while
$R_z[m]$ is not. Fortunately, practical implementation of the MMSE equalizer often assumes the
form of a finite impulse response (FIR) filter. When $F(z)$ is FIR, the MMSE equalizer can be
numerically determined from Eq. (12.39). Let

$$ F(z) = \sum_{i=0}^{M} f[i]\,z^{-i} $$

The orthogonality condition of Eq. (12.39) then is reduced to a finite set of linear equations

$$ \sum_{i=0}^{M} f[i]\,R_z[\ell-i] = \begin{cases} E_s\,h[u-\ell]^* & \ell = 0, 1, \ldots, u \\ 0 & \ell = u+1, u+2, \ldots, M \end{cases} \tag{12.43a} $$

Alternatively, we can write the MMSE condition into matrix form for $u < M$:

$$
\begin{bmatrix}
R_z[0] & R_z[-1] & \cdots & R_z[-M] \\
R_z[1] & R_z[0] & \cdots & R_z[1-M] \\
\vdots & \vdots & \ddots & \vdots \\
R_z[M] & R_z[M-1] & \cdots & R_z[0]
\end{bmatrix}
\begin{bmatrix} f[0] \\ f[1] \\ \vdots \\ f[M] \end{bmatrix}
= E_s
\underbrace{\begin{bmatrix} h[u]^* \\ h[u-1]^* \\ \vdots \\ h[0]^* \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{M+1 \text{ rows}}
\tag{12.43b}
$$

Of course, if the delay $u$ exceeds $M$, then the right-hand side of Eq. (12.43b) becomes

$$
\begin{bmatrix}
R_z[0] & R_z[-1] & \cdots & R_z[-M] \\
R_z[1] & R_z[0] & \cdots & R_z[1-M] \\
\vdots & \vdots & \ddots & \vdots \\
R_z[M] & R_z[M-1] & \cdots & R_z[0]
\end{bmatrix}
\begin{bmatrix} f[0] \\ f[1] \\ \vdots \\ f[M] \end{bmatrix}
= E_s
\begin{bmatrix} h[u]^* \\ h[u-1]^* \\ \vdots \\ h[u-M]^* \end{bmatrix}
\tag{12.43c}
$$

The solution is unique so long as the autocorrelation matrix in Eq. (12.43c) has full rank.
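The finite-length design can be sketched numerically as follows. The code builds $R_z[m]$ from Eq. (12.40) for a known FIR channel, solves the $(M+1)\times(M+1)$ system of Eq. (12.43a), and evaluates $E_s(1-c_u)$ from Eq. (12.41) for several delays. This is a minimal sketch under assumptions: the channel taps, $E_s = 1$, and $\mathcal{N} = 0.1$ are hypothetical, the function names are illustrative, and the small Gaussian-elimination routine is used only to keep the example free of external libraries.

```python
def autocorr(h, es, n0, m):
    """R_z[m] from Eq. (12.40) for an FIR channel h[0..L]."""
    if m < 0:
        return autocorr(h, es, n0, -m).conjugate()
    acc = sum(complex(h[m + j]) * complex(h[j]).conjugate()
              for j in range(len(h) - m))
    return es * acc + (n0 / 2.0 if m == 0 else 0.0)


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def mmse_equalizer(h, es, n0, num_taps, delay):
    """Solve sum_i f[i] R_z[l - i] = E_s h[u - l]^* for l = 0..M (Eq. 12.43a)."""
    a = [[autocorr(h, es, n0, l - i) for i in range(num_taps)]
         for l in range(num_taps)]
    b = [es * complex(h[delay - l]).conjugate() if 0 <= delay - l < len(h) else 0j
         for l in range(num_taps)]
    return solve(a, b)


def combined_response(h, f):
    """Coefficients c_i of C(z) = F(z) H(z)."""
    c = [0j] * (len(h) + len(f) - 1)
    for i, hi in enumerate(h):
        for k, fk in enumerate(f):
            c[i + k] += hi * fk
    return c


def mmse_value(h, f, es, delay):
    """MSE(u) = E_s (1 - c_u), Eq. (12.41); valid for 0 <= delay < len(h)+len(f)-1."""
    return es * (1 - combined_response(h, f)[delay]).real


def direct_mse(h, f, es, n0, delay):
    """Cross-check: E_s * sum_i |c_i - delta[i-u]|^2 + (N/2) * sum_i |f[i]|^2."""
    c = combined_response(h, f)
    isi = sum(abs(ci - (1.0 if i == delay else 0.0)) ** 2 for i, ci in enumerate(c))
    return es * isi + (n0 / 2.0) * sum(abs(fi) ** 2 for fi in f)


if __name__ == "__main__":
    h = [1.0, 0.5]  # hypothetical channel
    for u in range(6):
        f = mmse_equalizer(h, 1.0, 0.1, 5, u)
        print(u, round(mmse_value(h, f, 1.0, u), 4))
```

Sweeping the delay in this way is one simple way to apply Eq. (12.42): the delay with the smallest printed value is the preferred one for the assumed channel. Setting the noise level to zero in the same code gives the corresponding finite-length ZF-style design.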

MMSE vs. ZF

Note that if we simply set the noise spectral level to $\mathcal{N} = 0$, the MMSE equalizer design of
Eqs. (12.39) and (12.43c) is easily reduced to the ZF design. In other words, the only design
change from MMSE to ZF is to replace $R_z[0]$ from the noisy to the noise-free case of

$$ R_z[0] = E_s\sum_{j=0}^{\infty} |h_j|^2 $$

All other procedures can be directly followed to numerically obtain the ZF equalizer parameters.

It is important to understand, however, that the design of finite length ZF equalizers
according to Eq. (12.43c) may or may not achieve the objective of forcing all ISI to zero. In
fact, if the channel $H(z)$ has finite order $L$, then ZF design would require

$$ F(z)H(z) = \sum_{i=0}^{M} f[i]\,z^{-i}\;\sum_{i=0}^{L} h[i]\,z^{-i} = z^{-u} $$

This equality would be impossible for any stable causal equalizer to achieve. The reason is
quite simple if we consider the basics of polynomials. The left-hand side is a polynomial of
order $M + L$. Hence, it has a total of $M + L$ roots, whose locations depend on the channel and
the equalizer transfer functions. On the other hand, the right-hand side has a root at $\infty$ only. It
is therefore impossible to fully achieve this zero-forcing equality. Thus, one would probably
ask the following question: What would a finite length equalizer achieve if designed according
to Eq. (12.43c)?

The answer can in fact be found in the MMSE objective function when the noise is zero.
Specifically, the equalizer is designed to minimize

$$ \overline{\left|d[n]-s_{n-u}\right|^2} = \overline{\left|F(z)H(z)\,s_n - s_{n-u}\right|^2} $$

when the channel noise is not considered. Hence, the solution to Eq. (12.43c) would lead to
a finite length equalizer that achieves the minimum difference between $F(z)H(z)$ and a pure
delay $z^{-u}$. In terms of the time domain, the finite length ZF design based on Eq. (12.43c) will
minimize the ISI distortion that equals

$$ |c_u-1|^2 + \sum_{j\neq u} |c_j|^2 = \left|\sum_{i=0}^{M} h[u-i]\,f[i] - 1\right|^2 + \sum_{j\neq u}\left|\sum_{i=0}^{M} h[j-i]\,f[i]\right|^2 $$

In other words, this equalizer will minimize the contribution of ISI to the mean square error
in $d[n]$.

Finite Data Design 

The MMSE (and ZF) design of Eqs. (12.39) and (12.43c) assumes statistical knowledge of
$R_z[m]$ and $\overline{s_{n-u}\,z^*[n-\ell]}$. In practice, such information is not always readily available and
may require real-time estimation. Instead, it is more common for the transmitter to send a short
sequence of training (or pilot) symbols that the receiver can use to determine the optimum
equalizer. We now describe how the previous design can be directly extended to cover this
scenario.

Suppose a training sequence $\{s_n,\ n = n_1, n_1+1, \ldots, n_2\}$ is transmitted. To design an FIR
equalizer

$$ F(z) = f[0] + f[1]\,z^{-1} + \cdots + f[M]\,z^{-M} $$

we can minimize the average square error

$$ J = \frac{1}{n_2-n_1+1}\sum_{n=n_1+u}^{n_2+u}\left|d[n]-s_{n-u}\right|^2 $$

where

$$ d[n] = \sum_{i=0}^{M} f[i]\,z[n-i] $$

To minimize $J$, we can take its gradient with respect to $f[j]$. By setting the gradient to zero,
we can derive the conditions required by the optimum equalizer parameters

$$ \sum_{i=0}^{M} f[i]\,\frac{1}{n_2-n_1+1}\sum_{n=u+n_1}^{u+n_2} z[n-i]\,z^*[n-j] = \frac{1}{n_2-n_1+1}\sum_{n=u+n_1}^{u+n_2} s_{n-u}\,z^*[n-j] \qquad j = 0, 1, \ldots, M \tag{12.44} $$

These $M + 1$ equations can be written more compactly as

$$
\begin{bmatrix}
\tilde{R}_z[0,0] & \tilde{R}_z[1,0] & \cdots & \tilde{R}_z[M,0] \\
\tilde{R}_z[0,1] & \tilde{R}_z[1,1] & \cdots & \tilde{R}_z[M,1] \\
\vdots & \vdots & \ddots & \vdots \\
\tilde{R}_z[0,M] & \tilde{R}_z[1,M] & \cdots & \tilde{R}_z[M,M]
\end{bmatrix}
\begin{bmatrix} f[0] \\ f[1] \\ \vdots \\ f[M] \end{bmatrix}
=
\begin{bmatrix} \tilde{R}_{sz}[-u] \\ \tilde{R}_{sz}[-u+1] \\ \vdots \\ \tilde{R}_{sz}[-u+M] \end{bmatrix}
\tag{12.45}
$$

where we denote the time average approximations of the correlation functions (for $i, j = 0, 1, \ldots, M$):

$$ \tilde{R}_z[i,j] = \frac{1}{n_2-n_1+1}\sum_{n=u+n_1}^{u+n_2} z[n-i]\,z^*[n-j] \qquad \tilde{R}_{sz}[-u+j] = \frac{1}{n_2-n_1+1}\sum_{n=u+n_1}^{u+n_2} s_{n-u}\,z^*[n-j] $$

It is quite clear from comparing Eqs. (12.45) and (12.43c) that under a short training sequence
(preamble), the optimum equalizer can be obtained by replacing the exact values of the
correlation function with their time average approximations. If matrix inversion is to be avoided for
complexity reasons, adaptive channel equalization is a viable technology. Adaptive channel
equalization was first developed by Lucky at Bell Labs⁴,⁵ for telephone channels. It belongs
to the field of adaptive filtering. Interested readers can refer to the book by Ding and Li⁶ and
the references therein.
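As a small illustration of the training-based design, the sketch below fits a two-tap equalizer ($M = 1$) by forming the time-averaged correlations of Eq. (12.44) and solving the resulting $2\times 2$ system with Cramer's rule. The restriction to two taps, the function names, the training symbols, and the noise-free channel $h = [1, 0.5]$ are all assumptions made to keep the example short; the sums run over whatever received samples are available rather than exactly $n = u+n_1, \ldots, u+n_2$, and the common averaging factor is dropped because it cancels on both sides.

```python
def fit_two_tap_equalizer(z, s, delay):
    """Least-squares f[0], f[1] minimising sum_n |f0 z[n] + f1 z[n-1] - s[n-delay]|^2."""
    start = max(1, delay)
    r = [[0j, 0j], [0j, 0j]]  # r[j][i] = sum_n z[n-i] z*[n-j]
    p = [0j, 0j]              # p[j]    = sum_n s[n-delay] z*[n-j]
    for n in range(start, len(z)):
        window = (complex(z[n]), complex(z[n - 1]))
        for j in range(2):
            p[j] += s[n - delay] * window[j].conjugate()
            for i in range(2):
                r[j][i] += window[i] * window[j].conjugate()
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    f0 = (p[0] * r[1][1] - r[0][1] * p[1]) / det
    f1 = (r[0][0] * p[1] - r[1][0] * p[0]) / det
    return f0, f1


def training_cost(z, s, taps, delay):
    """Sum of squared errors of a two-tap equalizer over the training block."""
    f0, f1 = taps
    start = max(1, delay)
    return sum(abs(f0 * z[n] + f1 * z[n - 1] - s[n - delay]) ** 2
               for n in range(start, len(z)))


# Hypothetical BPSK training block and its noise-free output for h = [1, 0.5].
TRAINING = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1]
RECEIVED = [TRAINING[n] + 0.5 * (TRAINING[n - 1] if n > 0 else 0)
            for n in range(len(TRAINING))]

if __name__ == "__main__":
    taps = fit_two_tap_equalizer(RECEIVED, TRAINING, delay=0)
    print(taps, training_cost(RECEIVED, TRAINING, taps, 0))
```

A two-tap filter cannot invert this channel exactly, but the fitted taps give a smaller training error than passing the received samples straight to the decision device.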


12.4 LINEAR FRACTIONALLY SPACED 
EQUALIZERS (FSE) 

We have shown that when the channel response is unknown to the receiver, TSE is likely to
lose important signal information. In fact, this point is quite clear from the sampling theory.
As shown by Gitlin and Weinstein,⁷ when the transmitted signal (or pulse shape) does have
spectral content beyond a frequency of $1/(2T)$ Hz, baud rate sampling at the frequency of $1/T$
is below the Nyquist rate and can lead to spectral aliasing. Consequently, receiver performance
may be poor because of information loss.

In most cases, when the transmission pulse satisfies Nyquist's first criterion of zero ISI,
the received signal component is certain to possess frequency content above $1/(2T)$ Hz. For
example, when a raised-cosine (or a root-raised-cosine) pulse $p_{\mathrm{rrc}}(t)$ is adopted with roll-off
factor $r$ [Eq. (12.23)], the signal component bandwidth is

$$ \frac{1+r}{2T}\ \text{Hz} \;\ge\; \frac{1}{2T}\ \text{Hz} $$

For this reason, sampling at $1/T$ will certainly cause spectral aliasing and information loss
unless we use the perfectly matched filter $q(-t)$ and the ideal sampling moments $t = kT$.
Hence, the use of faster samplers has great significance. When the actual sampling period
is an integer fraction of the baud period $T$, the sampled signal under linear modulation can
be equivalently represented by a single-input-multiple-output (SIMO) discrete system model.
The resulting equalizers are known as the fractionally spaced equalizers (or FSE).


12.4.1 The Single-Input—Multiple-Output (SIMO) Model 

An FSE can be obtained from the system in Fig. 12.6 if the channel output is sampled at a rate
faster than the baud or symbol rate $1/T$. Let $m$ be an integer such that the sampling interval
becomes $\Delta = T/m$. In general, because the (root) raised-cosine pulse has bandwidth $B$:

$$ \frac{1}{2T} \le B = \frac{1+r}{2T} \le \frac{1}{T} $$

any sampling rate of the form $1/\Delta = m/T$ ($m > 1$) will be above the Nyquist sampling rate
and can avoid aliasing. For analysis, denote the sequence of channel output samples as

\begin{align}
z(k\Delta) &= \sum_{n=0}^{\infty} s_n\,h(k\Delta-nT) + w(k\Delta) \nonumber\\
&= \sum_{n=0}^{\infty} s_n\,h(k\Delta-nm\Delta) + w(k\Delta) \tag{12.46}
\end{align}

Figure 12.6 Fractionally spaced sampling receiver front end for FSE. (Block diagram: $y(t)$, receiver filter $p(-t)$, $z(t)$, sampler at $t = n\Delta$, $z(n\Delta)$.)

To simplify our notation, the oversampled channel output z(k A) can be reorganized (decimated) 
into m parallel subsequences 

Zi[k] = z{kT + (A) 

— z(kmA -f /A) 

oo 

= s n h(kmA + /A — nm A) + w(£mA -f rA). 

F1=0 

OC 

= ^2$ u h(kT - nT + iA) + w (kT + /A) i = 1, . .., m (12.47) 

n—Q 


Each subsequence $z_i[k]$ is related to the original data via

$$z_i[k] = z(kT + i\Delta) = s_k * h(kT + i\Delta) + w(kT + i\Delta)$$

In effect, each subsequence is an output of a linear subchannel. By denoting each subchannel
response as

$$h_i[k] \triangleq h(kT + i\Delta) \iff H_i(z) = \sum_{k=0}^{\infty} h_i[k]\, z^{-k}$$

and the corresponding subchannel noise as

$$w_i[k] = w(kT + i\Delta)$$

then the reorganized $m$ subchannel outputs are

$$\begin{aligned}
z_i[k] &= \sum_{n=0}^{\infty} s_n\, h_i[k-n] + w_i[k] \\
       &= \sum_{n=0}^{\infty} h_i[n]\, s_{k-n} + w_i[k] \qquad i = 1, \ldots, m
\end{aligned} \tag{12.48}$$

Thus, these $m$ subsequences can be viewed as stationary outputs of $m$ discrete channels
with a common input sequence $s[k]$ as shown in Fig. 12.7. Naturally, this represents a
single-input-multiple-output (SIMO) system analogous to a physical receiver with $m$ antennas. The
FSE is in fact a bank of $m$ filters $\{F_i(z)\}$ that jointly attempts to minimize the channel distortion
shown in Fig. 12.7.

Figure 12.7 Equivalent structure of fractionally spaced equalizers (FSE).
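The decimation into subchannels in Eqs. (12.47) and (12.48) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the sample channel taps, and the symbols are illustrative assumptions, the noise is omitted, and the offsets are numbered i = 0, ..., m - 1 rather than 1, ..., m.

```python
def convolve(a, b):
    """Linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def oversampled_output(symbols, h_frac, m):
    """Noise-free z(j*Delta) = sum_n s_n h(j*Delta - n*m*Delta), as in Eq. (12.46)."""
    z = [0.0] * ((len(symbols) - 1) * m + len(h_frac))
    for n, s in enumerate(symbols):
        for j, hj in enumerate(h_frac):
            z[n * m + j] += s * hj
    return z


def subchannel(h_frac, m, i):
    """h_i[k] = h(kT + i*Delta): every m-th channel sample, starting at offset i."""
    return h_frac[i::m]


# Hypothetical example values: m = 2 samples per symbol period T.
m = 2
h_frac = [1.0, 0.5, -0.25, 0.125]      # h(0), h(Delta), h(2*Delta), h(3*Delta)
symbols = [1.0, -1.0, 1.0, 1.0]

z_over = oversampled_output(symbols, h_frac, m)
for i in range(m):
    z_i = z_over[i::m]                  # decimated subsequence z_i[k]
    # Each subsequence equals the symbols filtered by its own subchannel.
    assert z_i == convolve(symbols, subchannel(h_frac, m, i))
```

Each decimated stream behaves as the output of its own baud-rate FIR channel driven by the same symbols, which is the SIMO picture of Fig. 12.7.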



12.4.2 FSE Designs 

Based on the SIMO representation of the FSE in Fig. 12.7, one FSE filter is provided for each
subsequence $z_i[k]$. In fact, the actual equalizer is a vector of filters

$$F_i(z) = \sum_{k=0}^{M} f_i[k]\, z^{-k} \qquad i = 1, \ldots, m \tag{12.49}$$

The $m$ filter outputs are summed to form the stationary equalizer output

$$y[k] = \sum_{i=1}^{m} \sum_{n=0}^{M} f_i[n]\, z_i[k-n] \tag{12.50}$$

Given the linear relationship between equalizer output and equalizer parameters, any TSE
design criterion can be generalized to the FSE design.

ZF Design 

To design a ZF FSE, the goal is to eliminate all ISI at the input of the decision device. Because
there are now $m$ parallel subchannels, the ZF filters should satisfy

$$C(z) = \sum_{i=1}^{m} F_i(z)\, H_i(z) = z^{-u} \tag{12.51}$$

This zero-forcing condition means that the decision output will have a delay of integer $u$.

A closer observation of this ZF requirement reveals its connection to a well-known equality
known as the Bezout identity. In the Bezout identity, suppose there are two polynomials of
orders up to $L$,

$$A_1(z) = \sum_{i=0}^{L} a_{1,i}\, z^{-i} \qquad \text{and} \qquad A_2(z) = \sum_{i=0}^{L} a_{2,i}\, z^{-i}$$

If $A_1(z)$ and $A_2(z)$ do not share any common root, then they are called coprime. The Bezout
identity states that if $A_1(z)$ and $A_2(z)$ are coprime, then there must exist two polynomials

$$B_1(z) = \sum_{i=0}^{M} b_{1,i}\, z^{-i} \qquad \text{and} \qquad B_2(z) = \sum_{i=0}^{M} b_{2,i}\, z^{-i}$$

such that

$$B_1(z) A_1(z) + B_2(z) A_2(z) = 1$$

The order requirement is that $M \ge L - 1$. The solution of $B_1(z)$ and $B_2(z)$ need not be unique.
It is evident from the classic text by Kailath⁸ that the ZF design requirement of Eq. (12.51) is
an $m$-channel generalization of the Bezout identity. To be precise, let $\{H_i(z),\ i = 1, 2, \ldots, m\}$
be a set of finite order polynomials of $z^{-1}$ with maximum order $L$. If the $m$ subchannel transfer
functions $\{H_i(z)\}$ are coprime, then there exists a set of filters $\{F_i(z)\}$ with orders $M \ge L - 1$
such that

$$\sum_{i=1}^{m} F_i(z)\, H_i(z) = z^{-u} \tag{12.52}$$

where the delay can be selected from the range $u = 0, 1, \ldots, M + L - 1$. Note that the
equalizer filters $\{F_i(z)\}$ vary with the desired delay $u$. Moreover, for each delay $u$, the ZF
equalizer filters $\{F_i(z)\}$ are not necessarily unique.

We now describe the numerical approach to finding the equalizer filter parameters. Instead 
of continuing with the polynomial representation in the z-domain, we can equivalently find 
the matrix representation of Eq. (12.52) as 
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(12.53) 


The numerical design as a solution to this ZF design exists if and only if $\mathbf{H}$ has full row
rank, that is, if the rows of $\mathbf{H}$ are linearly independent. This condition is satisfied for FSE (i.e.,
$m > 1$) if $M \ge L$ and $\{H_i(z)\}$ are coprime.⁶
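As a small worked illustration of the Bezout condition in Eq. (12.52), the sketch below treats the simplest case of m = 2 first-order subchannels (L = 1) with single-tap equalizer filters (M = 0). The function names and the two example subchannels are assumptions chosen for illustration; the 2 x 2 system is solved by Cramer's rule rather than through the general matrix form.

```python
def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def combined_response(filters, subchannels):
    """Coefficients of C(z) = sum_i F_i(z) H_i(z), as in Eq. (12.51)."""
    length = max(len(f) + len(h) - 1 for f, h in zip(filters, subchannels))
    total = [0.0] * length
    for f, h in zip(filters, subchannels):
        for k, c in enumerate(poly_mul(f, h)):
            total[k] += c
    return total


def zf_two_taps(h1, h2, u):
    """Solve f1*H1(z) + f2*H2(z) = z^-u for scalar f1, f2 (m = 2, L = 1, M = 0)."""
    target = [1.0 if k == u else 0.0 for k in range(2)]
    # The determinant vanishes when the two first-order subchannels are
    # proportional, i.e. when they share their single root (not coprime).
    det = h1[0] * h2[1] - h2[0] * h1[1]
    if det == 0:
        raise ValueError("subchannels are not coprime; no ZF solution")
    f1 = (target[0] * h2[1] - h2[0] * target[1]) / det
    f2 = (h1[0] * target[1] - target[0] * h1[1]) / det
    return [f1], [f2]


# Hypothetical subchannels H1(z) = 1 + 0.5 z^-1 and H2(z) = 1 - 0.5 z^-1.
h1, h2 = [1.0, 0.5], [1.0, -0.5]
for u in (0, 1):
    f1, f2 = zf_two_taps(h1, h2, u)
    print(u, f1, f2, combined_response([f1, f2], [h1, h2]))
```

The equalizer taps change with the chosen delay u, and in both cases a finite-length pair of filters removes the ISI exactly, something a single baud-rate FIR equalizer cannot do for an FIR channel.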
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MMSE FSE Design 

We will apply a similar technique to provide the MMSE FSE design. The difference between
FSE and TSE lies in the output signal

$$d[n] = \sum_{i=1}^{m} \sum_{k=0}^{M} f_i[k]\, z_i[n-k]$$

To minimize the MSE $\overline{|d[n] - s_{n-u}|^2}$, the principle of orthogonality leads to

$$\overline{(d[n] - s_{n-u})\, z_j^*[n-\ell]} = 0 \qquad \ell = 0, 1, \ldots, M, \quad j = 1, \ldots, m \tag{12.54}$$

Therefore, the equalizer parameters must satisfy

$$\sum_{i=1}^{m} \sum_{k=0}^{M} f_i[k]\; \overline{z_i[n-k]\, z_j^*[n-\ell]} = \overline{s_{n-u}\, z_j^*[n-\ell]}
\qquad \ell = 0, 1, \ldots, M, \quad j = 1, 2, \ldots, m$$

There are $m(M+1)$ equations for the $m(M+1)$ unknown parameters $\{f_i[k]\}$, $i = 1, \ldots, m$,
$k = 0, \ldots, M$. The MMSE FSE can be found as a solution to this set of linear equations. In
terms of practical issues, we should also make the following observations:

■ When we have only finite length data to estimate the necessary statistics,
$\overline{s_{n-u}\, z_j^*[n-\ell]}$ and $\overline{z_i[n-k]\, z_j^*[n-\ell]}$
can be replaced by their time averages from the limited data collection. This is similar to the
TSE design.

■ Also similar to the MMSE TSE design, different values of delay $u$ will lead to different mean
square errors. To find the optimum delay, we can evaluate the MSE for all possible delays
$u = 0, 1, \ldots, M + L - 1$ and choose the delay that results in the lowest MSE value.

Since their first appearance, adaptive equalizers have often been implemented as FSE.
When training data are available, FSE has the advantage of suppressing timing phase sensitivity.⁷
Unlike the case in TSE, linear FSE does not necessarily amplify the channel noise. Indeed, the
noise amplification effect depends strongly on the coprime channel condition. In some cases,
the subchannels in a set do not strictly share any common zero. However, if there is at least one
point $z_a$ that is almost a root of all the subchannels, that is,

$$H_i(z_a) \approx 0 \qquad i = 1, \ldots, m$$

then we say that the subchannels are close to being singular. When the subchannels are coprime
but are close to being singular, the noise amplification effect can still be quite severe.


12.5 CHANNEL ESTIMATION 


Thus far, we have focused on the direct equalizer design approach in which the equalizer
filter parameters are directly estimated from the channel input signal and the channel output
signals $z_i[n]$. We should recognize that if an MLSE receiver is implemented, the MLSE algorithm
requires the knowledge of channel parameters $\{h[k]\}$. When exact channel knowledge is not
available, the receiver must first complete the important first step of channel estimation.

In channel estimation, it is most common to consider FIR channels of finite order $L$.
Similar to the linear estimation of equalizer parameters introduced in the last section, channel
estimation should first consider the channel input-output relationship

$$z[n] = \sum_{k=0}^{L} h[k]\, s_{n-k} + w[n] \tag{12.55}$$

If consecutive pilot symbols $\{s_n,\ n = n_1, n_1 + 1, \ldots, n_2\}$ are transmitted, then because of the
finite channel order $L$, the following channel output samples

$$\{z[n],\ n = n_1 + L,\ n_1 + L + 1,\ \ldots,\ n_2\}$$

depend on these pilot data and noise only. We can apply the principle of MMSE to estimate
the channel coefficients $\{h[k]\}$ to minimize the average estimation error:

$$J(h[0], h[1], \ldots, h[L]) = \frac{1}{n_2 - n_1 - L + 1} \sum_{n=n_1+L}^{n_2}
\left| z[n] - \sum_{k=0}^{L} h[k]\, s_{n-k} \right|^2 \tag{12.56}$$

This MMSE estimation can be simplified by setting to zero the derivative of
$J(h[0], h[1], \ldots, h[L])$ with respect to each $h[j]$. Removing redundant constants, we have

$$\sum_{n=n_1+L}^{n_2} z[n]\, s_{n-j}^* - \sum_{k=0}^{L} h[k] \cdot \sum_{n=n_1+L}^{n_2} s_{n-k}\, s_{n-j}^* = 0
\qquad j = 0, 1, \ldots, L$$

Therefore, by defining

$$r_{sz}[j] \triangleq \sum_{n=n_1+L}^{n_2} z[n]\, s_{n-j}^* \qquad \text{and} \qquad
R_s[j,k] \triangleq \sum_{n=n_1+L}^{n_2} s_{n-k}\, s_{n-j}^* \qquad j, k = 0, 1, \ldots, L$$

we can simplify the MMSE channel estimation into a compact matrix expression:

$$\begin{bmatrix}
R_s[0,0] & R_s[0,1] & \cdots & R_s[0,L] \\
R_s[1,0] & R_s[1,1] & \cdots & R_s[1,L] \\
\vdots & \vdots & \ddots & \vdots \\
R_s[L,0] & R_s[L,1] & \cdots & R_s[L,L]
\end{bmatrix}
\begin{bmatrix} h[0] \\ h[1] \\ \vdots \\ h[L] \end{bmatrix}
=
\begin{bmatrix} r_{sz}[0] \\ r_{sz}[1] \\ \vdots \\ r_{sz}[L] \end{bmatrix} \tag{12.57}$$

Eq. (12.57) can be solved by matrix inversion to estimate the channel parameters $h[k]$.

In the more general case of FSE, the same method can be used to estimate the $i$th subchannel
parameters by simply replacing $z[n-k]$ with $z_i[n-k]$.
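The normal equations of Eq. (12.57) can be assembled and solved in a few lines. The sketch below is illustrative only: it assumes real-valued pilots (so the conjugates drop out), a noise-free channel, $n_1 = 0$, and a hypothetical pilot pattern and channel; the small Gaussian-elimination helper stands in for whatever linear solver a real receiver would use.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / M[r][r]
    return x


def estimate_channel(pilots, z, L):
    """Build r_sz[j] and R_s[j, k] over n = L, ..., len(pilots) - 1 and solve Eq. (12.57)."""
    idx = range(L, len(pilots))
    r_sz = [sum(z[n] * pilots[n - j] for n in idx) for j in range(L + 1)]
    R_s = [[sum(pilots[n - k] * pilots[n - j] for n in idx) for k in range(L + 1)]
           for j in range(L + 1)]
    return solve_linear(R_s, r_sz)


# Hypothetical pilots: a length-7 binary sequence repeated four times.
pilots = [1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0] * 4
h_true = [1.0, 0.5, -0.25]                      # assumed channel of order L = 2
L = len(h_true) - 1
z = [sum(h_true[k] * pilots[n - k] for k in range(L + 1) if n - k >= 0)
     for n in range(len(pilots))]

h_est = estimate_channel(pilots, z, L)
assert all(abs(e - t) < 1e-9 for e, t in zip(h_est, h_true))
```

Without noise the estimate reproduces the channel taps; with noise the same equations give the least-squares fit over the pilot window.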


12.6 DECISION FEEDBACK EQUALIZER 


The TSE and FSE we have discussed thus far are known as linear equalizers because the
equalization consists of a linear filter followed by a memoryless decision device. These linear
equalizers are also known as feedforward (FFW) equalizers. The advantages of FFW equalizers
lie in their simple implementation as FIR filters and in the straightforward design approaches
they accommodate. FFW equalizers require much lower computational complexity than the
nonlinear MLSE receivers.

Figure 12.8 A decision feedback equalizer with fractionally spaced samples.

On the other hand, FFW equalizers do suffer from several major weaknesses. First, the
TSE or FSE in their FFW forms can cause severe noise amplification depending on the
underlying channel conditions. Second, depending on the roots of the channel polynomials,
the FFW equalizer(s) may need to be very long to be effective, particularly when the channels
are nearly singular. To achieve simple and effective channel equalization without risking noise
amplification, a decision feedback equalizer (DFE) proves to be a very useful tool.

Recall that FFW equalizers generally serve as a channel inverse filter (in ZF design) or a 
regularized channel inverse filter (in MMSE design). The DFE, however, comprises another 
feedback filter in addition to a feedforward filter. The feedforward filter is identical to linear 
TSE or FSE, whereas the feedback filter attempts to cancel ISI from previous data samples 
using data estimates generated by a memoryless decision device. The feedforward filter may 
be operating on fractionally spaced samples. Hence, there may be m parallel filters as shown 
in Fig. 12.8. 

The basic idea behind the inclusion of a feedback filter $B(z)$ is motivated by awareness that
the feedforward filter output $d[k]$ may contain some residual ISI that can be more effectively
regenerated by the feedback filter output and canceled from $v[k]$. More specifically, consider
the case in which the feedforward filter output $d[k]$ consists of

$$d[k] = s_{k-u} + \underbrace{\sum_{i=u+1}^{N} c_i\, s_{k-i}}_{\text{residual ISI}} + w[n] \tag{12.58}$$

There is a residual ISI term and a noise term. If the decision output is very accurate such that

$$\hat{s}_{k-u} = s_{k-u}$$

then the feedback filter input will equal the actual data symbol. If we denote the feedback
filter as

$$B(z) = \sum_{i=1}^{N-u} b_i\, z^{-i}$$

then we have

$$\begin{aligned}
v[k] &= d[k] - \sum_{i=1}^{N-u} b_i\, \hat{s}_{k-u-i} \\
     &= s_{k-u} + \sum_{i=u+1}^{N} c_i\, s_{k-i} - \sum_{i=1}^{N-u} b_i\, \hat{s}_{k-u-i} + w[n] \\
     &= s_{k-u} + \sum_{i=u+1}^{N} c_i\, s_{k-i} - \sum_{i=1}^{N-u} b_i\, s_{k-u-i} + w[n] \\
     &= s_{k-u} + \sum_{i=1}^{N-u} (c_{u+i} - b_i)\, s_{k-u-i} + w[n]
\end{aligned} \tag{12.59}$$

To eliminate the residual ISI, the feedback filter should have coefficients

$$b_i = c_{u+i} \qquad i = 1, 2, \ldots, N - u$$

With these matching DFE parameters, the residual ISI is completely canceled. Hence, the input
to the decision device

$$v[k] = s_{k-u} + w[n]$$

contains zero ISI. The only nuisance that remains in $v[k]$ is the noise. Because the noise term
in $d[k]$ is not affected or amplified by the feedback filter, the decision output for the next time
instant would be much more accurate after all residual ISI has been canceled.
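The cancellation in Eq. (12.59) can be traced with a short simulation. This is a sketch under stated assumptions, not the design procedure of the text: BPSK symbols, delay $u = 0$, no noise, symbols before the start of the block taken as zero, and hypothetical residual ISI taps $c_1, c_2$ with the matched choice $b_i = c_{u+i}$.

```python
def feedforward_output(symbols, c):
    """d[k] = s_k + sum_i c_i s_{k-i} (delay u = 0, noise omitted)."""
    d = []
    for k, sk in enumerate(symbols):
        isi = sum(ci * symbols[k - i] for i, ci in enumerate(c, start=1) if k - i >= 0)
        d.append(sk + isi)
    return d


def dfe_detect(d, b):
    """Feedback loop v[k] = d[k] - sum_i b_i * s_hat[k-i] with a BPSK slicer."""
    decisions = []
    for k, dk in enumerate(d):
        feedback = sum(bi * decisions[k - i]
                       for i, bi in enumerate(b, start=1) if k - i >= 0)
        v = dk - feedback                      # input to the decision device
        decisions.append(1.0 if v >= 0 else -1.0)
    return decisions


symbols = [1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0]
c = [0.8, -0.5]            # hypothetical residual ISI taps c_1, c_2
b = list(c)                # matched feedback taps b_i = c_{u+i}

decisions = dfe_detect(feedforward_output(symbols, c), b)
assert decisions == symbols
```

As long as every earlier decision is correct, the feedback term reproduces the ISI exactly and the slicer sees only the current symbol; a single wrong decision would feed a wrong value back into the loop.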

Our DFE analysis so far has focused on the ideal operation of DFE when the decision
results are correct. Traditionally, the design and analysis of DFE has often been based on such
an idealized operating scenario. The design of DFE filters must include both the feedforward
filters and the feedback filter. Although historically there have been a few earlier attempts to
fully decouple the design of the feedforward filter and the feedback filter, the more recent work
by Al Dhahir and Cioffi provides a comprehensive and rigorous discussion.

In the analysis of a DFE, the assumption of correct decision output leads to the removal
of ISI in $v[k]$, and hence, a better likelihood that the decision output is accurate. One cannot
help but notice this circular "chicken or egg" argument. The truth of the matter is that the
DFE is inherently a nonlinear system. More importantly, the hard decision device is not even
differentiable. As a result, most traditional analytical tools developed for linear and nonlinear
systems no longer apply. For this reason, the somewhat ironic chicken-egg analysis becomes
the last resort. Fortunately, for high-SNR systems, this circular argument does yield analytical
results that can be closely matched by experiments.

Error Propagation in DFE 

Because of its feedback structure, the DFE does suffer from the particular phenomenon known
as error propagation. For example, when the decision device makes an error, the erroneous
symbol will be sent to the feedback filter and used for ISI cancellation in Eq. (12.59). However,
because the symbol is incorrect, instead of canceling the ISI caused by this symbol, the
canceling subtraction may instead strengthen the ISI in $v[k]$. As a result, the decision device is
more likely to make another subsequent error, and so on. This is known as error propagation.

Error propagation means that the actual DFE performance will be worse than the prediction
of analytical results derived from the assumption of perfect decision. Moreover, the effect of
error propagation means that DFE is more likely to make a short burst of decision errors before
recovery from the error propagation mode. The recovery time from error propagation depends
on the channel response and was investigated by Kennedy and Anderson.¹⁰


12.7 OFDM (MULTICARRIER) COMMUNICATIONS 

As we have learned from the design of TSE and FSE, channel equalization is exclusively the
task of the receivers. The only assistance provided by the transmitter to receiver equalization
is the potential transmission of training or pilot symbols. In a typically uncertain environment,
it makes sense for the receivers to undertake the task of equalization because the transmitter
normally has little or no knowledge of the channel response it uses.* Still, despite their simpler
implementation compared with the optimum MLSE, equalizers such as the feedforward and
decision feedback types often lead to less than satisfactory performance. More importantly, the
performance of the FFW and decision feedback equalizers is too sensitive to all the parameters
in their transversal structure. If even one parameter fails to hold the desired value, an entire
equalizer could crumble.

In a number of applications, however, the transmitters have partial information regarding
the channel characteristics. One of the most important pieces of partial channel information is
the channel delay spread; that is, for a finite length channel

$$H(z) = \sum_{k=0}^{L} h[k]\, z^{-k}$$

the channel order $L$ is known at the transmitter while $\{h[k]\}$ are still unknown. Given this partial
channel information, the particular transmission technique known as orthogonal frequency
division modulation (OFDM) can be implemented at the transmitter. With the application of
OFDM, the task of receiver equalization is significantly simplified.


12.7.1 Principles of OFDM 

Consider a transmitter that is in charge of transmitting a sequence of data signals $\{s_k\}$ over the
FIR channel $H(z)$ of order up to $L$. Before we begin to describe the fundamentals of OFDM,
we note that the frequency response of the FIR channel can be represented as

$$H(e^{j2\pi fT}) = \sum_{k=0}^{L} h[k]\, e^{-j2\pi fkT} \tag{12.60}$$

where $T$ is the symbol duration and also the sampling period. Because $H(e^{j2\pi fT})$ is the
frequency response of the channel $h[k] = h(kT)$, it is a periodic function of $f$ with period $1/T$.

The discrete Fourier transform (DFT) is a sampled function of the channel frequency
response. Let $N$ be the total number of uniform samples in each frequency period $1/T$. Then
the frequency $f$ is sampled at

$$f_0 = 0 \cdot \frac{1}{NT} = 0, \qquad f_1 = 1 \cdot \frac{1}{NT} = \frac{1}{NT}, \qquad \ldots, \qquad
f_{N-1} = (N-1) \cdot \frac{1}{NT} = \frac{N-1}{NT}$$

We can use a simpler notation to denote the DFT sequence by letting $\omega_n = 2\pi n/NT$:

$$\begin{aligned}
H[n] &= H(e^{j\omega_n T}) \\
     &= \sum_{k=0}^{L} h[k] \exp(-j\omega_n T k) \\
     &= \sum_{k=0}^{L} h[k] \exp\!\left(-j2\pi \frac{n}{NT}\, kT\right) \\
     &= \sum_{k=0}^{L} h[k] \exp\!\left(-j2\pi \frac{nk}{N}\right) \qquad n = 0, 1, \ldots, (N-1)
\end{aligned} \tag{12.61}$$

* In a stationary environment (e.g., DSL lines) the channels are quite stable, and the receivers can use a reverse link
channel to inform the transmitter of its forward channel information. This channel state information (CSI) feedback,
typically performed at a low bit rate to ensure accuracy, can consume rather valuable bandwidth resources.


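From Eq. (12.61), it is useful to notice that $H[n]$ is periodic with period $N$ (Fig. 12.9). Hence,

$$H[-n] = \sum_{k=0}^{L} h[k] \exp\!\left(j2\pi \frac{nk}{N}\right) \tag{12.62a}$$

$$\begin{aligned}
\phantom{H[-n]} &= \sum_{k=0}^{L} h[k] \exp\!\left(j2\pi \frac{nk}{N} - j2\pi \frac{Nk}{N}\right) \\
&= \sum_{k=0}^{L} h[k] \exp\!\left(-j2\pi \frac{(N-n)k}{N}\right) \\
&= H[N-n]
\end{aligned} \tag{12.62b}$$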

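Based on the linear convolutional relationship between the channel input $\{s_k\}$ and output

$$z[k] = \sum_{i=0}^{L} h[i]\, s_{k-i} + w[k]$$

a vector of $N$ output symbols can be written in matrix form as

$$\begin{bmatrix} z[N] \\ z[N-1] \\ \vdots \\ z[L] \\ \vdots \\ z[1] \end{bmatrix}
=
\begin{bmatrix}
h[0] & h[1] & \cdots & h[L] & & & \\
 & h[0] & h[1] & \cdots & h[L] & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & h[0] & h[1] & \cdots & h[L]
\end{bmatrix}
\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \\ s_0 \\ \vdots \\ s_{-(L-1)} \end{bmatrix}
+
\begin{bmatrix} w[N] \\ w[N-1] \\ \vdots \\ w[L] \\ \vdots \\ w[1] \end{bmatrix} \tag{12.63}$$

Figure 12.9 (a) Discrete time domain channel response and (b) its corresponding periodic DFT.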

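The key step in OFDM is to introduce what is known as the cyclic prefix in the transmitted
data.* This step replaces the $L$ leading elements

$$s_0,\ s_{-1},\ \ldots,\ s_{-(L-1)}$$

of the $(N + L)$-dimensional data vector by the trailing symbols

$$\{s_N,\ s_{N-1},\ \ldots,\ s_{N-L+1}\} \rightarrow \{s_0,\ s_{-1},\ \ldots,\ s_{-(L-1)}\}$$

By inserting the cyclic prefix, we can then rewrite Eq. (12.63) as

$$\begin{bmatrix} z[N] \\ z[N-1] \\ \vdots \\ z[L] \\ \vdots \\ z[1] \end{bmatrix}
=
\begin{bmatrix}
h[0] & h[1] & \cdots & h[L] & & & \\
 & h[0] & h[1] & \cdots & h[L] & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & h[0] & h[1] & \cdots & h[L]
\end{bmatrix}
\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \\ s_N \\ \vdots \\ s_{N-L+1} \end{bmatrix}
+
\begin{bmatrix} w[N] \\ w[N-1] \\ \vdots \\ w[L] \\ \vdots \\ w[1] \end{bmatrix} \tag{12.64a}$$

$$=
\underbrace{\begin{bmatrix}
h[0] & h[1] & \cdots & h[L] & 0 & \cdots & 0 \\
0 & h[0] & h[1] & \cdots & h[L] & \cdots & 0 \\
\vdots & & \ddots & \ddots & & \ddots & \vdots \\
0 & \cdots & 0 & h[0] & h[1] & \cdots & h[L] \\
h[L] & 0 & \cdots & 0 & h[0] & \cdots & h[L-1] \\
\vdots & \ddots & & & & \ddots & \vdots \\
h[1] & \cdots & h[L] & 0 & \cdots & 0 & h[0]
\end{bmatrix}}_{\mathbf{H}_{cp}:\ (N \times N)}
\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \end{bmatrix}
+
\begin{bmatrix} w[N] \\ w[N-1] \\ \vdots \\ w[1] \end{bmatrix} \tag{12.64b}$$

The critical role of the cyclic prefix is to convert the convolution channel matrix in Eq. (12.64a)
into a well-structured $N \times N$ cyclic matrix $\mathbf{H}_{cp}$ in Eq. (12.64b).

* Besides the use of the cyclic prefix, zero padding is an alternative but equivalent approach.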

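Next, we need to introduce the $N$-point DFT matrix and the corresponding inverse DFT
matrix. First, it is more convenient to denote

$$W_N = \exp\!\left(-j\frac{2\pi}{N}\right)$$

This complex number has some useful properties:

• $W_N^N = 1$
• $W_N^{-i} = W_N^{N-i}$

If we take the DFT of the $N$-dimensional vector

$$\mathbf{v} = \begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_{N-1} \end{bmatrix}$$

then we have the DFT

$$V[n] = \sum_{k=0}^{N-1} v_k \exp\!\left(-j2\pi \frac{nk}{N}\right) = \sum_{k=0}^{N-1} v_k\, W_N^{nk}
\qquad n = 0, 1, \ldots, (N-1)$$

and

$$V[-n] = \sum_{k=0}^{N-1} v_k \exp\!\left(j2\pi \frac{nk}{N}\right) = \sum_{k=0}^{N-1} v_k\, W_N^{-nk}
\qquad n = 0, 1, \ldots, (N-1)$$

The inverse DFT can also be simplified as

$$v_k = \frac{1}{N} \sum_{n=0}^{N-1} V[n] \exp\!\left(j2\pi \frac{nk}{N}\right) = \frac{1}{N} \sum_{n=0}^{N-1} V[n]\, W_N^{-nk}
\qquad k = 0, 1, \ldots, (N-1)$$

Thus, the $N$-point DFT of $\mathbf{v}$ can be written in the matrix form

$$\mathbf{V} = \begin{bmatrix} V[0] \\ V[1] \\ \vdots \\ V[N-1] \end{bmatrix}
= \begin{bmatrix}
W_N^{0 \cdot 0} & W_N^{0 \cdot 1} & \cdots & W_N^{0 \cdot (N-1)} \\
W_N^{1 \cdot 0} & W_N^{1 \cdot 1} & \cdots & W_N^{1 \cdot (N-1)} \\
\vdots & \vdots & \ddots & \vdots \\
W_N^{(N-1) \cdot 0} & W_N^{(N-1) \cdot 1} & \cdots & W_N^{(N-1)(N-1)}
\end{bmatrix}
\begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_{N-1} \end{bmatrix} \tag{12.65}$$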


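If we denote the $N \times N$ DFT matrix as

$$\mathbf{W}_N \triangleq \begin{bmatrix}
1 & 1 & \cdots & 1 \\
1 & W_N^{1} & \cdots & W_N^{(N-1)} \\
\vdots & \vdots & \ddots & \vdots \\
1 & W_N^{(N-1)} & \cdots & W_N^{(N-1)^2}
\end{bmatrix} \tag{12.66a}$$

then $\mathbf{W}_N$ also has an inverse

$$\mathbf{W}_N^{-1} = \frac{1}{N} \begin{bmatrix}
1 & 1 & \cdots & 1 \\
1 & W_N^{-1} & \cdots & W_N^{-(N-1)} \\
\vdots & \vdots & \ddots & \vdots \\
1 & W_N^{-(N-1)} & \cdots & W_N^{-(N-1)^2}
\end{bmatrix} \tag{12.66b}$$

This can be verified (Prob. 12.7-1) by showing that

$$\mathbf{W}_N \cdot \mathbf{W}_N^{-1} = \mathbf{I}_{N \times N}$$

Given this notation, we have the relationship of

$$\mathbf{V} = \mathbf{W}_N \cdot \mathbf{v}$$

$$\mathbf{v} = \mathbf{W}_N^{-1} \cdot \mathbf{V}$$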

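An amazing property of the cyclic matrix $\mathbf{H}_{cp}$ can be established by applying the DFT and
IDFT matrices.

$$\begin{aligned}
\mathbf{H}_{cp} \cdot \mathbf{W}_N^{-1}
&= \begin{bmatrix}
h[0] & h[1] & \cdots & h[L] & 0 & \cdots & 0 \\
0 & h[0] & h[1] & \cdots & h[L] & \cdots & 0 \\
\vdots & & \ddots & \ddots & & \ddots & \vdots \\
h[1] & \cdots & h[L] & 0 & \cdots & 0 & h[0]
\end{bmatrix}
\cdot \frac{1}{N}
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
1 & W_N^{-1} & \cdots & W_N^{-(N-1)} \\
\vdots & \vdots & \ddots & \vdots \\
1 & W_N^{-(N-1)} & \cdots & W_N^{-(N-1)^2}
\end{bmatrix} \\
&= \frac{1}{N}
\begin{bmatrix}
H[0] & H[-1] & \cdots & H[-N+1] \\
H[0] & H[-1]\, W_N^{-1} & \cdots & H[-N+1]\, W_N^{-(N-1)} \\
\vdots & \vdots & \ddots & \vdots \\
H[0] & H[-1]\, W_N^{-(N-1)} & \cdots & H[-N+1]\, W_N^{-(N-1)^2}
\end{bmatrix} \\
&= \frac{1}{N}
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
1 & W_N^{-1} & \cdots & W_N^{-(N-1)} \\
\vdots & \vdots & \ddots & \vdots \\
1 & W_N^{-(N-1)} & \cdots & W_N^{-(N-1)^2}
\end{bmatrix}
\begin{bmatrix}
H[0] & & & \\
 & H[-1] & & \\
 & & \ddots & \\
 & & & H[-N+1]
\end{bmatrix} \\
&= \mathbf{W}_N^{-1} \cdot \mathbf{D}_H
\end{aligned} \tag{12.67a}$$

where we have defined the diagonal matrix with the channel DFT entries as

$$\mathbf{D}_H = \begin{bmatrix}
H[0] & & & \\
 & H[-1] & & \\
 & & \ddots & \\
 & & & H[-N+1]
\end{bmatrix}
= \begin{bmatrix}
H[N] & & & \\
 & H[N-1] & & \\
 & & \ddots & \\
 & & & H[1]
\end{bmatrix}$$

The last equality follows from the periodic nature of $H[n]$ given in Eq. (12.62b). We leave it as
homework to show that any cyclic matrix of size $N \times N$ can be diagonalized by premultiplication
with $\mathbf{W}_N$ and postmultiplication with $\mathbf{W}_N^{-1}$ (Prob. 12.7-2).

Based on Eq. (12.67a) we have established the following very important relationship for
OFDM:

$$\mathbf{H}_{cp} = \mathbf{W}_N^{-1} \cdot \mathbf{D}_H \cdot \mathbf{W}_N \tag{12.67b}$$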
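The diagonalization in Eq. (12.67b) is easy to confirm numerically for a small case. The sketch below is only a check under assumed values (N = 4 and a hypothetical order-2 channel); the helper names are illustrative, and plain nested lists are used instead of a matrix library.

```python
import cmath


def dft_matrix(N):
    """Entries W_N^{nk} with W_N = exp(-j 2 pi / N), as in Eq. (12.66a)."""
    return [[cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)] for n in range(N)]


def idft_matrix(N):
    """Entries (1/N) W_N^{-nk}, as in Eq. (12.66b)."""
    return [[cmath.exp(2j * cmath.pi * n * k / N) / N for k in range(N)] for n in range(N)]


def circulant(h, N):
    """Cyclic matrix of Eq. (12.64b): row r, column c holds h[(c - r) mod N]."""
    return [[h[(c - r) % N] if (c - r) % N < len(h) else 0.0 for c in range(N)]
            for r in range(N)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


N = 4
h = [1.0, 0.5, 0.25]                       # assumed channel, order L = 2 < N
D = matmul(matmul(dft_matrix(N), circulant(h, N)), idft_matrix(N))

for r in range(N):
    # Expected diagonal entry H[-r] = sum_k h[k] exp(+j 2 pi r k / N).
    H_minus_r = sum(hk * cmath.exp(2j * cmath.pi * r * k / N) for k, hk in enumerate(h))
    assert abs(D[r][r] - H_minus_r) < 1e-9
    assert all(abs(D[r][c]) < 1e-9 for c in range(N) if c != r)
```

The product $\mathbf{W}_N \mathbf{H}_{cp} \mathbf{W}_N^{-1}$ comes out diagonal, with entries $H[0], H[-1], \ldots, H[-N+1]$ in that order.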

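Recall that after the cyclic prefix has been added, the channel input-output relationship is
reduced to Eq. (12.64b). As a result,

$$\begin{bmatrix} z[N] \\ z[N-1] \\ \vdots \\ z[1] \end{bmatrix}
= \left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right)^{-1} \mathbf{D}_H \left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right)
\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \end{bmatrix}
+ \begin{bmatrix} w[N] \\ w[N-1] \\ \vdots \\ w[1] \end{bmatrix}$$

This means that if we put the information source data into

$$\tilde{\mathbf{s}} = \begin{bmatrix} \tilde{s}_N \\ \tilde{s}_{N-1} \\ \vdots \\ \tilde{s}_1 \end{bmatrix}$$

then we can obtain the OFDM transmission symbols via

$$\mathbf{s} = \begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \end{bmatrix}
= \left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right)^{-1}
\begin{bmatrix} \tilde{s}_N \\ \tilde{s}_{N-1} \\ \vdots \\ \tilde{s}_1 \end{bmatrix}$$

Despite the minor scalar $1/\sqrt{N}$, we can call the matrix transformation of
$\left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right)^{-1}$ the IDFT
(inverse DFT) operation. In other words, we apply IDFT on the information source data $\tilde{\mathbf{s}}$ at
the OFDM transmitter to obtain $\mathbf{s}$ before adding the cyclic prefix.

Similarly, we can also transform the channel output vector via

$$\tilde{\mathbf{z}} = \begin{bmatrix} \tilde{z}[N] \\ \tilde{z}[N-1] \\ \vdots \\ \tilde{z}[1] \end{bmatrix}
= \left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right)
\begin{bmatrix} z[N] \\ z[N-1] \\ \vdots \\ z[1] \end{bmatrix}$$

Corresponding to the IDFT, this operation can also be named the DFT. Finally, we note that
the noise vector at the channel output also undergoes the DFT:

$$\tilde{\mathbf{w}} = \begin{bmatrix} \tilde{w}[N] \\ \tilde{w}[N-1] \\ \vdots \\ \tilde{w}[1] \end{bmatrix}
= \left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right)
\begin{bmatrix} w[N] \\ w[N-1] \\ \vdots \\ w[1] \end{bmatrix}$$

We now can see the very simple relationship between the source data and the channel output
vector, which has undergone the DFT:

$$\tilde{\mathbf{z}} = \mathbf{D}_H\, \tilde{\mathbf{s}} + \tilde{\mathbf{w}} \tag{12.68a}$$

Because $\mathbf{D}_H$ is diagonal, this matrix product is essentially element-wise multiplication:

$$\tilde{z}[n] = H[n]\, \tilde{s}_n + \tilde{w}[n] \qquad n = 1, \ldots, N \tag{12.68b}$$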


This shows that we now equivalently have $N$ parallel (sub)channels, each of which is just a
scalar channel with gain $H[n]$. Each vector of $N$ data symbols in OFDM transmission is known
as an OFDM frame or an OFDM symbol. Each subchannel $H[n]$ is also known as a subcarrier.

Thus, by applying the IDFT on the source data vector and the DFT on the channel output
vector, OFDM converts an ISI channel of order $L$ into $N$ parallel subchannels without ISI. We
no longer have to deal with the complex convolution that involves the time domain channel
response. Instead, every subchannel is a non-frequency-selective gain only. There is no ISI
within each subchannel. The $N$ parallel subchannels are independent of one another because
their noises are independent. This is why such a modulation is known as orthogonal frequency
division modulation (OFDM). The block diagram of an $N$-point OFDM system implementation
with a linear FIR channel of order $L$ is given in Fig. 12.10.
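The whole chain (IDFT, cyclic prefix, FIR channel, prefix removal, DFT) can be sketched end to end. The code below is a hedged illustration rather than the system of Fig. 12.10: it is noise-free, it indexes samples in natural time order (0, 1, ..., N - 1) instead of the reversed vector stacking used in the equations above, and the QPSK data, the channel taps, and the function names are assumptions.

```python
import cmath
import math


def unitary_dft(x, sign):
    """DFT (sign = -1) or inverse DFT (sign = +1), both scaled by 1/sqrt(N)."""
    N = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * n * k / N) for k in range(N))
            / math.sqrt(N) for n in range(N)]


def ofdm_link(data, h):
    """Send one OFDM frame through an FIR channel h of order L = len(h) - 1."""
    N, L = len(data), len(h) - 1
    x = unitary_dft(data, +1)                    # IDFT at the transmitter
    tx = x[N - L:] + x                           # cyclic prefix: last L samples go first
    rx = [sum(h[k] * tx[n - k] for k in range(L + 1) if n - k >= 0)
          for n in range(len(tx))]               # linear convolution with the channel
    return unitary_dft(rx[L:], -1)               # drop the prefix, then DFT


data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
h = [1.0, 0.5, 0.25]                             # assumed channel, order L = 2
received = ofdm_link(data, h)

N = len(data)
for n in range(N):
    H_n = sum(hk * cmath.exp(-2j * cmath.pi * n * k / N) for k, hk in enumerate(h))
    # Each subcarrier sees only its own symbol scaled by the channel DFT value.
    assert abs(received[n] - H_n * data[n]) < 1e-9
```

The assertion confirms the scalar relationship of Eq. (12.68b) for every subcarrier, with no contribution from neighboring symbols.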

12.7.2 OFDM Channel Noise 

According to Eq. (12.68b), each of the $N$ channels acts like a separate carrier of frequency
$f = n/NT$ with channel gain $H[n]$. Effectively, the original data symbols $\{\tilde{s}_n\}$ are split into $N$
sequences and transmitted over $N$ subcarriers. For this apparent reason, OFDM is also commonly
known as a multicarrier communication system. Simply put, OFDM utilizes IDFT and
cyclic prefix to effectively achieve multicarrier communications without the need to actually
generate and modulate multiple (sub)carriers. The effective block diagram of OFDM appears
in Fig. 12.11.

Figure 12.10 Illustration of an N-point OFDM transmission system.

Figure 12.11

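Now we can study the relationship between the transformed noise samples $\tilde{w}[n]$ in
Fig. 12.11. First, notice that

$$\begin{aligned}
\tilde{w}[N-j] &= \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} W_N^{k(N-j)}\, w[N-k] \\
               &= \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} W_N^{-kj}\, w[N-k] \qquad j = 0, 1, \ldots, (N-1)
\end{aligned}$$

They are linear combinations of jointly distributed Gaussian noise samples $\{w[N-k]\}$.
Therefore, $\{\tilde{w}[N-j]\}$ remains Gaussian. In addition, because $w[n]$ has zero mean, we have

$$\overline{\tilde{w}[N-j]} = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} W_N^{-kj}\, \overline{w[N-k]} = 0
\qquad j = 0, 1, \ldots, (N-1)$$

$$\begin{aligned}
\overline{\tilde{w}[N-i]\, \tilde{w}[N-j]^*}
&= \overline{\frac{1}{\sqrt{N}} \sum_{k_1=0}^{N-1} W_N^{-k_1 i}\, w[N-k_1] \cdot
   \frac{1}{\sqrt{N}} \sum_{k_2=0}^{N-1} W_N^{k_2 j}\, w[N-k_2]^*} \\
&= \frac{1}{N} \sum_{k_1=0}^{N-1} \sum_{k_2=0}^{N-1} W_N^{-k_1 i + k_2 j}\; \overline{w[N-k_1]\, w[N-k_2]^*} \\
&= \frac{1}{N} \sum_{k_1=0}^{N-1} \sum_{k_2=0}^{N-1} W_N^{-k_1 i + k_2 j}\; \frac{\mathcal{N}}{2}\, \delta[k_1 - k_2] \\
&= \frac{\mathcal{N}}{2N} \sum_{k_1=0}^{N-1} W_N^{k_1 (j - i)} \\
&= \frac{\mathcal{N}}{2}\, \delta[i - j]
 = \begin{cases} \mathcal{N}/2 & i = j \\ 0 & i \ne j \end{cases}
\end{aligned} \tag{12.69}$$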


Because $\{\tilde{w}[n]\}$ are zero mean with zero correlation, they are uncorrelated according to
Eq. (12.69). Moreover, $\{\tilde{w}[n]\}$ are also Gaussian noises. Since uncorrelated Gaussian random
variables are also independent, $\{\tilde{w}[n]\}$ are independent Gaussian noises with zero mean and
identical variance of $\mathcal{N}/2$. The independence of the $N$ channel noises demonstrates that OFDM
converts an FIR channel with ISI and order up to $L$ into $N$ parallel, independent, and AWGN
channels as shown in Fig. 12.11.


12.7.3 Zero-Padded OFDM 

We have shown that by introducing a cyclic prefix of length $L$, a circular convolution channel
matrix can be established. Because any circular matrix of size $N \times N$ can be diagonalized by
IDFT and DFT (Prob. 12.7-2), the ISI channel of order less than or equal to $L$ is transformed
into $N$ parallel independent subchannels.

There is also an alternative approach to the use of cyclic prefix. This method is known
as zero padding. The transmitter first performs an IDFT on the $N$ input data. Then, instead of
repeating the last $L$ symbols as in Eq. (12.64b) to transmit

$$\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \\ s_N \\ \vdots \\ s_{N-L+1} \end{bmatrix}$$

we can simply replace the cyclic prefix with $L$ zeros and transmit

$$\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}_{(N+L) \times 1}$$

The rest of the OFDM transmission steps remain unchanged. At the receiver end, we can stack
up the received symbols in

$$\mathbf{y} = \begin{bmatrix} z[N] \\ z[N-1] \\ \vdots \\ z[1] \end{bmatrix}
+ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ z[N+L] \\ \vdots \\ z[N+1] \end{bmatrix}$$

We then can show (Prob. 12.7-4) that

$$\tilde{\mathbf{z}} = \left(\frac{1}{\sqrt{N}} \mathbf{W}_N\right) \mathbf{y} \tag{12.70}$$

would achieve the same multichannel relationship of Eq. (12.68b).


12.7.4 Cyclic Prefix Redundancy in OFDM 

The two critical steps of OFDM at the transmitter are the insertion of the cyclic prefix and the
use of $N$-point IDFT. The necessary length of cyclic prefix $L$ depends on the order of the FIR
channel. Since the channel order may vary in practical systems, the OFDM transmitter must
be aware of the maximum channel order information a priori.

Although it is acceptable for OFDM transmitters to use an overestimated channel order, the
major disadvantage of inserting a longer-than-necessary cyclic prefix is the waste of channel
bandwidth. To understand this drawback, notice that in OFDM, the cyclic prefix makes possible
the successful transmission of $N$ data symbols $\{s_1, \ldots, s_N\}$ with time duration $(N + L)T$. The
$L$ cyclic prefix symbols are introduced by OFDM as redundancy to remove the ISI in the
original frequency-selective channel $H(z)$. Because $(N + L)$ symbol periods are now being
used to transmit the $N$ information data, the effective data rate of OFDM equals

$$\frac{N}{N + L} \cdot \frac{1}{T}$$

If $L$ is overestimated, the effective data rate is reduced, and the transmission of the unnecessarily
long cyclic prefix wastes some channel bandwidth. For this reason, OFDM transmitters require
accurate knowledge about the channel delay spread to achieve good bandwidth efficiency. If
the cyclic prefix is shorter than $L$, then the receiver is required to include a time domain
filter known as the channel-shortening filter to reduce the effective channel-filter response to
within $LT$.
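The cost of the prefix is a one-line calculation. The helper names and the numbers below (N = 64, L = 16 or 32, T = 1 microsecond) are illustrative assumptions rather than values taken from the text.

```python
def ofdm_efficiency(N, L):
    """Fraction of transmitted symbol periods that carry data: N / (N + L)."""
    return N / (N + L)


def effective_symbol_rate(N, L, T):
    """Effective data rate N / ((N + L) T) in symbols per second."""
    return N / ((N + L) * T)


# Doubling the prefix from 16 to 32 symbols lowers the efficiency from 0.8 to 2/3.
print(ofdm_efficiency(64, 16))
print(ofdm_efficiency(64, 32))
print(effective_symbol_rate(64, 16, 1e-6))
```

For a fixed prefix length, a larger N improves the efficiency, which is one reason practical systems choose N several times larger than L.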

12.7.5 OFDM Equalization 

We have shown that OFDM converts an ISI channel into $N$ parallel AWGN subchannels as
shown in Fig. 12.11. Each of the $N$ subchannels has an additive white Gaussian noise of
zero mean and variance $\mathcal{N}/2$. The subchannel gain equals $H[k]$, which is the FIR frequency
response at $k/NT$ Hz. Strictly speaking, these $N$ parallel channels do not have any ISI. Hence,
channel equalization is not necessary. However, because each subchannel has a different gain,
the optimum detection of $\{\tilde{s}_n\}$ from

$$\tilde{z}[n] = H[n]\, \tilde{s}_n + \tilde{w}[n] \qquad n = 1, \ldots, N$$

would require knowledge of the channel gain $H[n]$:

$$\hat{s}_n = \operatorname{dec}\!\left(H[n]^{-1}\, \tilde{z}[n]\right) \qquad n = 1, \ldots, N$$

Figure 12.12 Using a bank of receiver gain adjustors for N independent AWGN channels in OFDM
to achieve gain equalization.

This resulting OFDM receiver is shown in Fig. 12.12. For each subchannel, a one-tap
gain adjustment can be applied to compensate the subchannel scaling. In fact, this means that
we need to implement a bank of $N$ gain adjustment taps. The objective is to compensate the
$N$ subchannels such that the total gain of each data symbol equals unity before the QAM
decision device. In fact, the gain equalizers scale both the subchannel signal and the noise
equally. They do not change the subchannel SNR and do not change the detection accuracy.
Indeed, equalizers are used only to facilitate the use of the same modular decision device on all
subchannels. Oddly enough, this bank of gain elements at the receiver is exactly the same as
the equalizer in a high-fidelity audio amplifier. This structure is known henceforth as a one-tap
equalizer for OFDM receivers.
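A one-tap equalizer amounts to one complex division per subcarrier followed by a common slicer. The sketch below assumes QPSK symbols with components of ±1, nonzero and perfectly known subchannel gains, and small hand-picked noise values; all names and numbers are hypothetical.

```python
def qpsk_decision(v):
    """Slice a complex sample to the nearest QPSK point (+/-1 +/- j)."""
    return complex(1.0 if v.real >= 0 else -1.0, 1.0 if v.imag >= 0 else -1.0)


def one_tap_equalize(z, H):
    """Divide each subchannel output by its gain, then apply the same slicer."""
    return [qpsk_decision(zn / Hn) for zn, Hn in zip(z, H)]


symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
gains = [2j, 0.5, -1 + 1j, 0.25 - 0.25j]         # assumed known, all nonzero
noise = [0.01, -0.01j, 0.02, 0.0]                # small illustrative disturbances
z = [Hn * sn + wn for Hn, sn, wn in zip(gains, symbols, noise)]

assert one_tap_equalize(z, gains) == symbols
```

Dividing by $H[n]$ rotates and rescales signal and noise together, so the slicer can be identical on every subcarrier while the per-subcarrier SNR stays what the channel made it. A subcarrier with $H[n] = 0$ cannot be equalized this way.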


12.8 DISCRETE MULTITONE (DMT) MODULATIONS 

A slightly different form of OFDM is called discrete multitone (DMT) modulation. In DMT,
the basic signal processing operations are essentially identical to OFDM. The only difference
between DMT and a standard OFDM is that DMT transmitters are given knowledge of the
subchannel information. As a result, DMT transmits signals of differing constellations on
different subchannels (known as subcarriers). As shown in Fig. 12.13, the single RF channel is
split into $N$ subchannels or subcarriers by OFDM or DMT. Each subcarrier conveys a distinct
data sequence:

$$\{\cdots\ s_i[k+1]\ \ s_i[k]\ \ s_i[k-1]\ \cdots\}$$

The QAM constellations of the $N$ sequences can often be different.

Because the original channel distortion is frequency selective, subchannel gains are generally
different across the bandwidth. Thus, even though DMT or OFDM converts the channel
with ISI distortion into $N$ parallel independent channels without ISI, symbols transmitted over
different subcarriers will encounter different SNRs at the receiver end. In DMT, the receivers
are responsible for conveying to the transmitter all the subchannel information. As a result, the
transmitter can implement compensatory measures to optimize various performance metrics.
We mention two common approaches adopted at DMT transmitters:

■ Subcarrier power loading to maximize average receiver SNR.

■ Subcarrier bit loading to equalize the bit error rate (BER) across subcarriers.

Figure 12.13 DMT transmission of N different symbol streams over a single FIR channel.

Transmitter Power Loading for Maximizing Receiver SNR 

To describe the idea of power loading at the transmitter for maximizing total receiver SNR,
let $s_i[k]$ be the data stream carried by the $i$th subchannel and call $\{s_i[k]\}$ an independent data
sequence in time $k$. Let us further say that all data sequences $\{s_i[k]\}$ are also independent of
one another. Let the average power of $s_i[k]$ be

$$P_i = \overline{|s_i[k]|^2}$$

The total channel input power is

$$\sum_{i=1}^{N} P_i$$

whereas the corresponding channel output power at the receiver equals

$$\sum_{i=1}^{N} |H[i]|^2 \cdot P_i$$

Hence, the total channel output SNR is

$$\frac{\sum_{i=1}^{N} |H[i]|^2 \cdot P_i}{N \mathcal{N}/2} = \frac{2}{N \mathcal{N}} \sum_{i=1}^{N} |H[i]|^2 \cdot P_i$$

To determine the optimum power distribution, we would like to maximize the output SNR.
Because the channel input power is limited, the optimization requires

$$\max_{\{P_i \ge 0\}} \sum_{i=1}^{N} |H[i]|^2 \cdot P_i \qquad \text{subject to} \quad \sum_{i=1}^{N} P_i = P \tag{12.71}$$

Once again, wc can invoke the Cauchy’Schwartz inequality 


Y2 a,b ' 




with equality if and only if h l = kctf 
Based on the Cauchy-Schwartz inequality. 


A r 


N jV 


max 

(A>0) 


E 


\ 


j =1 i= 1 


if 


Pi=MH[i}\ 2 


(12.72a) 


(12.72b) 


Because of the input power constraint $\sum_{i=1}^{N} P_i = P$, the optimum input power distribution
should be

$$\sum_{i=1}^{N} P_i = \lambda \cdot \sum_{i=1}^{N} |H[i]|^2 = P \qquad (12.73a)$$

In other words,

$$\lambda = \frac{P}{\sum_{i=1}^{N} |H[i]|^2} \qquad (12.73b)$$

Substituting Eq. (12.73b) into Eq. (12.72b), we can obtain the optimum channel input power
loading across the N subchannels as

$$P_i = \frac{|H[i]|^2}{\sum_{i=1}^{N} |H[i]|^2} \, P \qquad (12.74)$$


This optimum distribution of power in OFDM, also known as power loading, makes very
good sense. When a channel has high gain, it is able to boost the power of its input much more
effectively than a channel with low gain. Hence, the high-gain subchannels will be receiving
higher power loading, while low-gain subchannels will receive much less. No power should
be wasted on the extreme case of a subchannel that has zero gain, since the output of such a
subchannel will make no power contribution to the total received signal power.
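
The allocation rule of Eq. (12.74) is simple enough to check numerically. The following is a
minimal sketch in Python, not part of the original text; the function name power_loading and
the four subchannel gains are hypothetical values chosen only for illustration.

```python
def power_loading(gains, total_power):
    """Split total_power across subchannels in proportion to |H[i]|^2, as in Eq. (12.74)."""
    energy = [abs(h) ** 2 for h in gains]
    total_energy = sum(energy)
    if total_energy == 0:
        raise ValueError("all subchannel gains are zero")
    return [total_power * e / total_energy for e in energy]


# Hypothetical subchannel gains H[i]; the third subchannel has zero gain.
H = [1.0, 0.5, 0.0, 0.5j]
P = power_loading(H, total_power=3.0)
print(P)
```

The zero-gain subchannel receives no power, and the allocated powers always add up to the
total input power P.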

In addition to the perspective of maximizing average SNR, information theory can also 
rigorously prove the optimality of power loading (known as water pouring) in maximizing the 
capacity of frequency-selective channels. This discussion will be presented later (Sec. 13.7). 





Subcarrier Bit Loading in DMT 

If the transmitter has obtained the channel information $|H[i]|$, it then becomes possible for
the transmitter to predict the detection error probability on the symbols transmitted over each
subcarrier. The SNR of each subcarrier is

$$\mathrm{SNR}_i = \frac{2 |H[i]|^2}{\mathcal{N}} \, \overline{|s_i[k]|^2}$$

Therefore, the BER on this particular subcarrier depends on the SNR and the QAM constellation
of the subcarrier. Different modulations at different subcarriers can lead to different powers
$\overline{|s_i[k]|^2}$.

Consider the general case in which the ith subchannel carries $K_i$ bits in each modulated
symbol. Furthermore, we denote the BER of the ith subchannel by $P_b[i]$. Then the average
receiver bit error rate across the N subcarriers is

$$\overline{P}_b = \frac{\sum_{i=1}^{N} K_i \cdot P_b[i]}{\sum_{i=1}^{N} K_i}$$

If all subchannels apply the same QAM constellation, then $K_i$ is constant for all i and

$$\overline{P}_b = \frac{1}{N} \sum_{i=1}^{N} P_b[i]$$

Clearly, subchannels with a very weak SNR will generate many detection errors, while
subchannels with a strong SNR will generate very few detection errors. If there is no power
loading, then the ith subchannel SNR is proportional to the subchannel gain $|H[i]|^2$. In other
words, BERs of poor subchannels can be larger than the BERs of good subchannels by several
orders of magnitude. Hence, the average BER $\overline{P}_b$ will be dominated by those large $P_b[i]$ from
poor subchannels. Based on this observation, we can see that to reduce the overall average
BER, it is desirable to "equalize" the subchannel BER. By making each subchannel equally
reliable, the average BER of the DMT system will improve. One effective way to "equalize"
subchannel BER is to apply the practice of bit loading. 11, 12
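
To see how strongly one poor subchannel can dominate the average, the bit-weighted average
BER can be evaluated directly. The sketch below is illustrative Python; the function name
average_ber and the four per-subchannel BER values are hypothetical, not measured data.

```python
def average_ber(bits_per_symbol, ber):
    """Bit-weighted average BER: sum(K_i * P_b[i]) / sum(K_i)."""
    total_bits = sum(bits_per_symbol)
    return sum(k * p for k, p in zip(bits_per_symbol, ber)) / total_bits


# Hypothetical example: four QPSK subcarriers (K_i = 2), one of them very poor.
K = [2, 2, 2, 2]
Pb = [1e-7, 1e-6, 1e-6, 1e-2]
avg = average_ber(K, Pb)
print(avg)
```

Here the single subchannel with BER 1e-2 contributes almost the entire average, which is the
motivation for equalizing the subchannel BER.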

To describe the concept of bit loading, Table 12.1 illustrates the SNR necessary to achieve
a detection error probability of $10^{-6}$ for five familiar constellations. It is clear that small
constellations (e.g., BPSK, QPSK) require much lower SNRs than large constellations (e.g.,
16-QAM, 32-QAM). This means that subcarriers with low gains should be assigned less complex
constellations and should carry fewer bits per symbol. In the extreme case of subchannels
with gains close to zero, no bit should be assigned and the subcarriers should be kept vacant.


TABLE 12.1
SNR Required to Achieve Detection Error Probability of 10^-6

Constellation    E_b/N at P_e = 10^-6, dB
BPSK             10.6
QPSK             10.6
8-PSK            14
16-QAM           14.5
32-QAM           17.4
On the other hand, subcarriers with large gains should be assigned more complex constellations
and should carry many more bits in each symbol. This distribution of bits at the transmitter
according to subcarrier conditions is called bit loading. In some cases, a subcarrier gain may be
a little too low to carry n bits per symbol but too wasteful to carry n - 1 bits per symbol. In such
cases, the transmitter can apply additional power loading to this subcarrier. Therefore, DMT
bit loading and power loading are often complementary at the transmitter. 11, 12 Figure 12.14
is a simple block diagram of the highly effective DMT bit-and-power loading.
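
A table lookup is one simple way to picture bit loading. The Python sketch below is only an
illustration: it assumes the transmitter already has an estimate of the available E_b/N (in dB)
on each subcarrier, and it uses the thresholds of Table 12.1 directly. The names TABLE_12_1 and
bits_for, and the four example values, are hypothetical.

```python
# (bits per symbol, required E_b/N in dB at P_e = 1e-6), largest constellation first.
# Thresholds are the entries of Table 12.1: 32-QAM, 16-QAM, 8-PSK, QPSK, BPSK.
TABLE_12_1 = [(5, 17.4), (4, 14.5), (3, 14.0), (2, 10.6), (1, 10.6)]


def bits_for(ebn_db):
    """Return the largest number of bits per symbol whose threshold is met, or 0."""
    for bits, required_db in TABLE_12_1:
        if ebn_db >= required_db:
            return bits
    return 0  # subcarrier too weak: keep it vacant


# Hypothetical per-subcarrier E_b/N estimates in dB.
estimates_db = [20.0, 15.0, 11.0, 3.0]
loading = [bits_for(x) for x in estimates_db]
print(loading)
```

A practical loader would also adjust subcarrier power when an estimate falls just below a
threshold, which is the complementary role of power loading described above.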

Cyclic Prefix and Channel Shortening 

The principles of OFDM and DMT require that the cyclic prefix be no shorter than the order 
of the FIR communication channel response. Although this requirement may be reasonable in 
a well-defined environment, for many applications, channel order or delay spread may have 
a large variable range. If a long cyclic prefix is always provisioned to target the worst-case 
(large) delay spread, then the overall bandwidth efficiency of the OFDM/DMT communication 
systems will be very low. 
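
The cost of a long cyclic prefix is easy to quantify: out of every block of N + P transmitted
samples, only N carry data. The Python sketch below illustrates this ratio; the function name
cp_efficiency and the block sizes are hypothetical examples.

```python
def cp_efficiency(fft_len, cp_len):
    """Fraction of transmitted samples that carry data when a cyclic prefix is added."""
    return fft_len / (fft_len + cp_len)


# Hypothetical comparison for a 512-sample block with a short and a long prefix.
short_prefix = cp_efficiency(512, 32)
long_prefix = cp_efficiency(512, 128)
print(short_prefix, long_prefix)
```

Provisioning the prefix for a worst-case delay spread four times longer lowers the efficiency
from about 94% to 80% in this example.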

To overcome this problem, it is more desirable to apply an additional time domain equalizer
(TEQ) at the receiver end to shorten the effective channel order. We note that the objective of
the TEQ is not to fully eliminate the ISI as in Sec. 12.3. Instead, the
purpose of the TEQ filter $G_{\mathrm{TEQ}}(z)$ is to shorten the effective order of the combined response of
channel and equalizer such that

$$G_{\mathrm{TEQ}}(z) H(z) \approx \sum_{k=0}^{L_1} q[k] z^{-k} \qquad L_1 < L$$

where H(z) denotes the channel transfer function and q[k] the shortened combined response.
This channel-shortening task is less demanding than full ISI removal. By forcing $L_1$ to be
(approximately) smaller than the original order L, a shorter cyclic prefix can be used to improve
the OFDM/DMT transmission efficiency. The inclusion of a TEQ for channel shortening is
illustrated in Fig. 12.15.


Figure 12.14 Bit and power loading in a DMT (OFDM) transmission system with N subcarriers.

Figure 12.15 Time domain equalizer (TEQ) for channel shortening.
12.9 REAL-LIFE APPLICATIONS OF OFDM AND DMT

OFDM is arguably one of the most successful signaling techniques for digital communications.
Combined with transmitter power loading and bit loading, the benefits of OFDM include high
spectral efficiency and resiliency against RF interferences and multipath distortion. As a result
of the many advantages, there are a number of practical OFDM/DMT communication systems
ranging from the wire-line digital subscriber line (DSL) system to the wireless ultrawideband
(UWB) radio as well as satellite broadcasting.

Asymmetric Digital Subscriber Line (ADSL) 

In the past few years, ADSL has replaced a vast majority of voice modems to become the
dominant technology providing internet service to millions of homes. Conventional voice
band modems use up to 3.4 kHz of analog bandwidth sampled at 8 kHz by the public switched
telephone network (PSTN). These dial-up modems convert bits into waveforms that must
fit into this tiny voice band. Because of the very small bandwidth, voice band modems are
forced to apply very large QAM constellations (e.g., 960-QAM in V.34 for 28.8 kbit/s). Large
QAM constellations require very high transmission power and high-complexity equalization.
For these reasons, voice band modems quickly hit a rate plateau at 56 kbit/s in the ITU-T V.90
recommendation.

ADSL, on the other hand, is not limited by the telephone voice band. In fact, ADSL completely
bypasses the voice telephone systems by specializing in data service. It relies on the
traditional twisted pair of copper phone lines to provide the last-mile connection to individual
homes. The main idea is that the copper wire channels in fact have bandwidth much larger
than the 4 kHz voice band. However, as distance increases, the copper wire channel degrades
rapidly at higher frequency. Hence, DSL can exploit the large telephone wire bandwidth
(up to 1 MHz) only when the connection distance is short (1-5 km). 14

The voice band is sometimes known as the plain-old-telephone-service (POTS) band.
POTS and DSL data service are separated in frequency. The voice traffic continues to use the
voice band below 3.4 kHz. DSL data uses the frequency band above the voice band. As shown
in Fig. 12.16, the separation of the two signals is achieved by a simple (in-line) low-pass filter
inserted between the phone outlet and each telephone unit when DSL service is available.

Figure 12.17 illustrates the bandwidth and subcarrier allocation of the ADSL system. From
the top of the POTS band to the nominal ADSL upper limit of 1104 kHz, we have 255 equally
spaced subchannels (subcarriers) of bandwidth 4.3125 kHz. These subcarriers are labeled 1 to
255. The lower number subcarriers, between 4.3125 and 25.875 kHz, may also be optionally
used by some service providers. In typical cases, however, ADSL service providers utilize the
nominal band of 25.875 to 1104 kHz (subcarrier 6 to subcarrier 255). These 250 available
subcarriers are divided between downstream data transmission (from DSL server to homes)
and upstream data (from homes to DSL server).

Figure 12.16

Figure 12.17 Frequency and subcarrier allocation in ADSL services.

TABLE 12.2
Basic ADSL Upstream and Downstream Subcarrier Allocations and Data Rates

                               Upstream                           Downstream
Modulation (bit loading)       QPSK to 64-QAM                     QPSK to 64-QAM
                               (2-6 bits per symbol)              (2-6 bits per symbol)
DMT frame transmission rate    4 kHz                              4 kHz
Pilot subcarrier               No. 64                             No. 96
Typical subcarriers            6 to 32                            33 to 255
Typical bits per frame         Up to 162 bits                     Up to 1326 bits
Maximum possible subcarriers   1 to 63                            1 to 255 (excluding 64 and 96)
Maximum bits per frame         Up to 378 bits                     Up to 1518 bits
Maximum data rate              4 kHz x 378 bits = 1.512 Mbit/s    4 kHz x 1518 bits = 6.072 Mbit/s

In today's internet applications, most individual consumers, unlike business users, have a
higher downstream need than upstream. These "asymmetric" data service requirements
define the objective of ADSL. Therefore in ADSL, the number of downstream subcarriers is
greater than the number of upstream subcarriers. In ADSL, subcarriers 6 to 32 (corresponding
to 25.875-138 kHz) are generally allocated for upstream data. Subcarrier 64 and subcarrier
96 are reserved for upstream pilot and downstream pilot, respectively. Excluding the two
pilot subcarriers, subcarriers 33 to 255 (corresponding to 138-1104 kHz) are allocated for
downstream data. The typical carrier allocation and data rates are summarized in Table 12.2.
Notice that this table applies only to the basic DSL recommendations by ITU-T (G.992.1).
Depending on the channel condition, various service providers may choose to increase the
data rate by using higher bandwidth and even more subcarriers above subcarrier 255.
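
The entries of Table 12.2 follow from simple counting, which the Python sketch below
reproduces. It is an illustration only. It assumes at most 6 bits (64-QAM) on every data
subcarrier and a subcarrier spacing of 4.3125 kHz, the value for which 6 x spacing = 25.875 kHz
and 256 x spacing = 1104 kHz. All constant and function names are hypothetical.

```python
SPACING_KHZ = 4.3125      # subcarrier spacing; 6 * 4.3125 = 25.875 kHz
FRAME_RATE_KHZ = 4        # DMT frames per millisecond (4 kHz frame rate)
MAX_BITS = 6              # 64-QAM carries 6 bits per subcarrier per frame
PILOTS = (64, 96)         # pilot subcarriers, excluded from downstream data


def bits_per_frame(subcarriers):
    """Upper bound on bits per DMT frame when every subcarrier carries 64-QAM."""
    return MAX_BITS * len(subcarriers)


typical_up = list(range(6, 33))
typical_down = [i for i in range(33, 256) if i not in PILOTS]
max_up = list(range(1, 64))
max_down = [i for i in range(1, 256) if i not in PILOTS]

# kbit/s = (frames per ms) * (bits per frame)
max_up_kbps = FRAME_RATE_KHZ * bits_per_frame(max_up)
max_down_kbps = FRAME_RATE_KHZ * bits_per_frame(max_down)

print(6 * SPACING_KHZ, 32 * SPACING_KHZ, 256 * SPACING_KHZ)
print(bits_per_frame(typical_up), bits_per_frame(typical_down))
print(max_up_kbps, max_down_kbps)
```

The counts give 162 and 1326 bits per frame for the typical allocation, and 1512 kbit/s
upstream and 6072 kbit/s downstream for the maximum allocation, matching the table.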

In ADSL, the DMT frame transmission rate is 4 kHz. Upstream DMT utilizes a 64-point
real-valued IFFT that is equivalent to a 32-point complex IFFT. The upstream cyclic prefix has length
4. On downstream, a 512-point real-valued IFFT is applied, equivalent to a 256-point complex IFFT.
The downstream cyclic prefix has length 32 (equivalent to 16 complex numbers). Because
the channel delay spread is usually larger than the prescribed cyclic prefix, TEQ channel
shortening is commonly applied in ADSL with the help of several thousand training symbols
(e.g., in downstream) to adapt the TEQ parameters.
Digital Broadcasting

Although North America has decided to adopt the ATSC standard for digital television
broadcasting at the maximum rate of 19.39 Mbit/s using 8-VSB modulation, DVB-T (digital video
broadcasting - terrestrial) has become a pan-European standard, also gaining acceptance in
parts of Asia, Latin America, and Australia. DVB-T was first introduced in 1997, 15 utilizing
OFDM over channels 6, 7, or 8 MHz wide.

DVB-T specifies three different OFDM transmission modes with increasing complexity
for different target bit rates (video quality). It can use 2048 subcarriers (2K mode), 4096
subcarriers (4K mode), and 8192 subcarriers (8K mode). The cyclic prefix length may be 1/32,
1/16, 1/8, or 1/4 of the FFT length in the three different modes. Each subcarrier can have
three modulation formats: QPSK, 16-QAM, or 64-QAM. When subchannel quality is poor,
a simpler constellation such as QPSK is used. When subchannel SNR is high, the 64-QAM
constellation is used. Different quality channels will bring about different video quality from
standard-definition TV (SDTV) to high-definition TV (HDTV).

The DVB-H standard for mobile video reception by handheld mobile phones was published
in 2004. The OFDM and QAM subcarrier modulation formats remain identical to those for
DVB-T. For lower video quality multimedia services, digital multimedia broadcasting (DMB)
also applies OFDM but limits itself to (differential) QPSK subcarrier modulation. Occupying
less than 1.7 MHz bandwidth, DMB can use as many as 1536 subcarriers.


Broad OFDM Applications 

DSL and DVB-T are only two limited applications of OFDM in digital communication
systems. Overall, OFDM has found broad applications in numerous terrestrial wireless
communication systems. An impressive list includes digital audio broadcasting (DAB), Wi-Fi
(IEEE 802.11a, IEEE 802.11g), WiMAX (IEEE 802.16), ultrawideband (UWB) radio (IEEE
802.15.3a), 3rd Generation Partnership Project (3GPP) long-term-evolution (LTE), and
high-speed OFDM packet access (HSOPA). Table 12.3 provides a snapshot of the important roles
played by OFDM in various communication systems.

It is noticeable, however, that OFDM has not been very popular in satellite communications
using directional antennas and in coaxial cable systems (e.g., cable modems, cable DTV). The
reason is in fact quite obvious. Directional satellite channels and coaxial cable channels have
very little frequency-selective distortion. In particular, they normally do not suffer from serious
multipath effects. Without having to combat significant channel delay spread and ISI, OFDM
would in fact be redundant. This is why systems such as digital satellite dish TV services and
cable digital services all prefer the basic single-carrier modulation format. Direct broadcasting
and terrestrial applications, on the other hand, often encounter multipath distortions and are
perfect candidates for OFDM.


Digital Audio Broadcasting 

As listed in Table 12.3, the European project Eureka 147 successfully launched OFDM
digital audio broadcasting (DAB). Eureka 147 covers both terrestrial digital audio broadcasting
and direct satellite audio broadcasting without directional receiving antennas. Receivers are
equipped only with traditional omnidirectional antennas. Eureka 147 requires opening a new
spectral band of 1452 to 1492 MHz in the L-band for both terrestrial and satellite broadcasting.

Despite the success of Eureka in Europe, however, concerns about spectral conflict in
the L-band led the United States to decide against using Eureka 147. Instead, DAB in North
America has split into satellite radio broadcasting by XM and Sirius, relying on proprietary
technologies on the one hand and terrestrial broadcasting using the IBOC (in-band, on-channel)
standard recommended by the FCC on the other. XM and Sirius competed as two separate
companies before completing their merger in 2008. The new company, Sirius XM, serves
satellite car radios, while IBOC targets traditional home radio customers. Sirius XM uses the
2.3 GHz S-band for direct satellite broadcasting. Under the commercial name of HD Radio
developed by iBiquity Digital Corporation, IBOC allows analog FM and AM stations to use
the same band to broadcast their content digitally by exploiting the gap between traditional
AM and FM radio stations. By October 2008, over 1.5 million HD Radio chipsets had been
shipped and there were more than 1800 HD Radio stations in the United States alone.

TABLE 12.3
A Short but Impressive History of OFDM Applications

Year    Events
1995    Digital audio broadcasting standard Eureka 147: first OFDM standard
1996    ADSL standard ANSI T1.413 (later became ITU G.992.1)
1997    DVB-T standard defined by ETSI
1998    Magic WAND project demonstrates OFDM modems for wireless LAN
1999    IEEE 802.11a wireless LAN standard (Wi-Fi)
2002    IEEE 802.11g standard for wireless LAN
2004    IEEE 802.16d standard for wireless MAN (WiMAX)
2004    MediaFLO announced by Qualcomm
2004    ETSI DVB-H standard
2004    Candidate for IEEE 802.15.3a (UWB) standard MB-OFDM
2004    Candidate for IEEE 802.11n standard for next-generation wireless LAN
2005    IEEE 802.16e (improved) standard for WiMAX
2005    Terrestrial DMB (T-DMB) standard (TS 102 427) adopted by ETSI (July)
2005    First T-DMB broadcast began in South Korea (December)
2005    Candidate for 3.75G mobile cellular standards (LTE and HSOPA)
2005    Candidate for CJK (China, Japan, Korea) 4G standard collaboration
2005    Candidate for IEEE P1675 standard for power line communications
2006    Candidate for IEEE 802.16m mobile WiMAX

In satellite radio operation, XM radio uses the bandwidth of 2332.5 to 2345.0 MHz.
This 12.5 MHz band is split into six carriers. Four carriers are used for satellite transmission.
XM radio uses two geostationary satellites to transmit identical program content. The
signals are transmitted with QPSK modulation from each satellite. For reliable reception, the
line-of-sight signals transmitted from satellite 1 are received, reformatted to multicarrier
modulation (OFDM), and rebroadcast by terrestrial repeaters. Each two-carrier group broadcasts
100 streams of 8 kbit/s. These streams represent compressed audio data. They are combined by
means of a patented process to form a variable number of channels using a variety of bit rates.

Sirius satellite radio, on the other hand, uses three orbiting satellites over the frequency
band of 2320 to 2332 MHz. These satellites are in lower orbit and are not geostationary. In fact,
they follow a highly inclined elliptical Earth orbit (HEO), also known as the Tundra orbit. Each
satellite completes one orbit in 24 hours and is therefore said to be geosynchronous. At any
given time, two of the three satellites will cover North America. Thus, the 12 MHz bandwidth
is equally divided among three carriers: two for the two satellites in coverage and one for
terrestrial repeaters. The highly reliable QPSK modulation is adopted for Sirius transmission.
Terrestrial repeaters are useful in some urban areas where satellite coverage may be blocked.

For terrestrial HD radio systems, OFDM is also a key modulation technology in IBOC
for both AM IBOC and FM IBOC. Unlike satellite DAB, which bundles multiple station
programs into a single data stream, AM IBOC and FM IBOC allow each station to use its
own spectral allocation to broadcast, just like a traditional radio station. FM IBOC has broader
bandwidth per station and provides a higher data rate. With OFDM, the FM IBOC subchannel
bandwidth equals 363.4 Hz, and the maximum number of subcarriers is 1093. Each subcarrier
uses QPSK modulation. On the other hand, the AM IBOC subchannel bandwidth is 181.7
Hz (half as wide), and as many as 104 subcarriers may be used. Each subcarrier can apply
16-point QAM (secondary subcarriers) or 64-point QAM (primary subcarriers). Further details
on IBOC can be found in the book by Maxson. 16

12.10 BLIND EQUALIZATION AND IDENTIFICATION 

Standard channel equalization and identification at receivers typically require a known
(training) signal transmitted by the transmitter to assist in system identification. Alternatively,
the training sequence can be used directly to determine the necessary channel equalizer.
Figure 12.18 illustrates how a training signal can be used in the initial setup phase of the receiver.

During the training phase, a known sequence is transmitted by the transmitter such that
the equalizer output can be compared with the desired input to form an error. The equalizer
parameters can be adjusted to minimize the mean square symbol error. At the end of the training
phase, the equalizer parameters should be near enough to their optimum values that much of
the intersymbol interference (ISI) is removed. Now that the channel input can be correctly
recovered from the equalizer output through a memoryless decision device (slicer), real data
transmission can begin. The decision output $\hat{s}[k-u]$ can be used as the correct channel input to
form the symbol error for continued equalizer adjustment or to track slow channel variations.
The adaptive equalizer then obtains its reference signal from the decision output when the
equalization system is switched to the decision-directed mode (Fig. 12.18). It is evident that this
training mechanism can be applied regardless of the equalizer in use, be it TSE, FSE, or DFE.
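
The two operating modes can be illustrated with a small simulation. The Python sketch below is
one possible realization, not the scheme of Fig. 12.18 itself. It assumes real-valued BPSK
symbols, a hypothetical noiseless two-tap channel, zero decision delay (u = 0), and the LMS
update as the adaptation rule. The function name, step size, and tap count are all illustrative.

```python
import random


def lms_equalizer(received, training, num_taps=7, mu=0.02):
    """T-spaced LMS equalizer: trained first, then decision-directed.

    The reference is the known training symbol while one is available,
    and the slicer decision afterwards.
    """
    w = [0.0] * num_taps          # equalizer taps
    buf = [0.0] * num_taps        # most recent channel outputs, newest first
    decisions = []
    for k, z in enumerate(received):
        buf = [z] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))      # equalizer output
        decision = 1.0 if y >= 0 else -1.0              # memoryless slicer
        ref = training[k] if k < len(training) else decision
        err = ref - y                                   # symbol error
        w = [wi + mu * err * bi for wi, bi in zip(w, buf)]
        decisions.append(decision)
    return w, decisions


random.seed(1)
symbols = [random.choice((-1.0, 1.0)) for _ in range(4000)]
# Hypothetical channel with ISI: z[k] = s[k] + 0.4 s[k-1], no noise.
received = [s + 0.4 * (symbols[k - 1] if k > 0 else 0.0)
            for k, s in enumerate(symbols)]

train_len = 2000
w, decisions = lms_equalizer(received, symbols[:train_len])
dd_errors = sum(d != s for d, s in zip(decisions[train_len:], symbols[train_len:]))
print([round(x, 3) for x in w[:3]], dd_errors)
```

After training, the leading taps approach the inverse of the channel (1, -0.4, 0.16, ...), and
the equalizer keeps adapting from its own decisions once the training sequence ends.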

In many communications, signals are transmitted over time-varying channels. As a result,
a periodic training signal is necessary to identify or equalize the time-varying channel response.
The drawback of this approach is evident in many communication systems where the use of a
training sequence can represent significant overhead costs or may even be impractical. For
instance, no training signal is available to receivers attempting to intercept enemy
communications. In a multicast or a broadcast system, it is highly undesirable for the transmitter to start
a training session for each new receiver by temporarily suspending its normal transmission
to all existing users. As a result, there is a strong and practical need for a special kind of
channel equalizer, known as a blind equalizer, that does not require the transmission of a training
sequence. Digital cable TV and cable modems are excellent examples of such systems that can
benefit from blind equalization.

There are a number of different approaches to the problem of blind equalization. In general,
blind equalization methods can be classified into direct and indirect approaches. In the direct
blind equalization approach, equalizer filters are derived directly from input statistics and the
observed output signal of the unknown channel. The indirect blind equalization approach first
identifies the underlying channel impulse response before designing an appropriate equalizer
filter or MLSE metrics. Understanding these subjects requires in-depth reading of the literature,
including papers from the 1980s by Benveniste et al., 17, 18 who pioneered the terminology
"blind equalization." Another very helpful source of information can be found in the papers by
Godard, 19 Picchi and Prati, 20 Shalvi and Weinstein, 21, 22 Rupprecht, 23 Kennedy and Ding, 24
Tong et al., 25 Moulines et al., 26 and Brillinger and Rosenblatt. 27 For more systematic coverage,
readers are referred to several published books on this topic. 6, 28, 29

Figure 12.18


12.11 TIME-VARYING CHANNEL DISTORTIONS 
DUE TO MOBILITY 

Thus far, we have focused on channel distortions that are invariant in time, or invariant at least 
for the period of concern. In mobile wireless communications, user mobility naturally leads to 
channel variation. Two main causes lead to time-varying channels: (1) a change of surroundings 
and (2) the Doppler effect. In most cases, a change of surroundings for a given user takes place 
at a much slower rate than the Doppler effect. For example, a transmitter/receiver traveling at 
the speed of 100 km/h, moves less than 2.8 meters in 100 ms. However, for carrier frequency of 
900 MHz, the maximum corresponding Doppler frequency shift would be 83 Hz. This means 
that within 100 ms, the channel could have undergone 8 full cycles of change. Thus, unless the 
mobile unit suddenly turns a corner or enters a tunnel, the Doppler effect is usually far more 
severe than the effect of change in surroundings. 
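
The numbers in this example are easy to reproduce. The Python sketch below is purely
illustrative: it takes the speed of light as 3 x 10^8 m/s and uses the hypothetical function name
max_doppler_hz.

```python
C = 3.0e8  # speed of light in m/s (rounded)


def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = (v / c) * f_c for a speed given in km/h."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * carrier_hz / C


speed_kmh = 100.0
interval_s = 0.1                                   # 100 ms
distance_m = speed_kmh / 3.6 * interval_s          # distance moved in 100 ms
f_d = max_doppler_hz(speed_kmh, 900e6)             # Doppler shift at 900 MHz
cycles = f_d * interval_s                          # Doppler cycles in 100 ms
print(round(distance_m, 2), round(f_d, 1), round(cycles, 1))
```

The mobile moves just under 2.8 m while the Doppler-induced phase rotates through roughly
8 full cycles, which is why the Doppler effect dominates over such short intervals.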

Doppler Shifts and Fading Channels 

In mobile communications, the mobility of transmitters and receivers can lead to what is 
known as the Doppler effect, described by the nineteenth-century Austrian physicist Christian 
Doppler. He observed that the frequency of light and sound waves is affected by the relative 
motion of the source and the receiver. Radio waves experience the same Doppler effect when 
the transmitter or receiver is in motion. In the case of a narrowband RF transmission of a signal 

$$m(t) \cos \omega_c t$$

if the relative velocity of the distance change between the source and the receiver equals $v_d$,
then the received RF signal effectively has a new carrier

$$m(t) \cos (\omega_c + \omega_d)t \qquad \omega_d = \frac{v_d}{c} \, \omega_c$$

where c is the speed of light. Note that $v_d$ and hence $\omega_d$ are negative when the source-to-receiver
distance decreases and positive when it increases.

In the multipath environment, if the mobile user is traveling at a given speed $v_d$, then the
line-of-sight path has the highest variation rate. This means that if there are K + 1 multipaths
in the channel, the ith propagation path distance would vary at the velocity of $v_i$. The ith signal
copy traveling along the ith path should have a Doppler shift

$$\omega_i = \frac{v_i}{c} \, \omega_c \qquad (12.75)$$

Moreover, because

$$-v_d \le v_i \le v_d$$

the maximum Doppler shift is bounded by

$$|\omega_i| \le \frac{|v_d|}{c} \, \omega_c$$


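Based on the Doppler analysis, each path has a Doppler frequency shift $\omega_i$, delay $\tau_i$, and
path attenuation $\alpha_i$. The signal from the ith path can be written as

$$\alpha_i \left[ \sum_k \mathrm{Re}\{s_k\} \, p(t - kT - \tau_i) \right] \cos [(\omega_c + \omega_i)(t - \tau_i)]
+ \alpha_i \left[ \sum_k \mathrm{Im}\{s_k\} \, p(t - kT - \tau_i) \right] \sin [(\omega_c + \omega_i)(t - \tau_i)] \qquad (12.76)$$

As a result, the baseband receiver signal after demodulation is now

$$y(t) = \sum_k s_k \left[ \sum_{i=0}^{K} \alpha_i \exp [-j(\omega_c + \omega_i)\tau_i] \exp (-j\omega_i t) \, p(t - kT - \tau_i) \right]
= \sum_k s_k \left[ \sum_{i=0}^{K} \beta_i(t) \, p(t - kT - \tau_i) \right] \qquad (12.77)$$

where $\beta_i(t) = \alpha_i \exp [-j(\omega_c + \omega_i)\tau_i] \exp (-j\omega_i t)$.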


Frequency-Selective Fading Channel 

Recall that the original baseband transmission is

$$x(t) = \sum_k s_k \, p(t - kT)$$

In the channel output of Eq. (12.77), if the mobile velocity is zero, then $\omega_i = 0$ and $\beta_i(t) = \beta_i$
are constant. In the case of zero mobility, the baseband channel output simply becomes

$$y(t) = \sum_k s_k \left[ \sum_{i=0}^{K} \beta_i \, p(t - kT - \tau_i) \right]$$

This means that the corresponding channel is linear time-invariant with impulse response

$$h(t) = \sum_{i=0}^{K} \beta_i \, \delta(t - \tau_i) \qquad (12.78)$$

and transfer function

$$H(f) = \sum_{i=0}^{K} \beta_i \exp (-j 2\pi f \tau_i) \qquad (12.79)$$


This is a frequency-selective channel with intersymbol interference (ISI). 
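
To see the frequency selectivity concretely, the transfer function of Eq. (12.79) can be
evaluated for a simple case. The Python sketch below is illustrative only: the two-path channel
with equal gains and a 1 microsecond delay difference is hypothetical, as is the function name
transfer_function.

```python
import cmath
import math


def transfer_function(paths, f_hz):
    """H(f) = sum_i beta_i * exp(-j 2 pi f tau_i), with paths given as (beta_i, tau_i)."""
    return sum(beta * cmath.exp(-2j * math.pi * f_hz * tau) for beta, tau in paths)


# Hypothetical static two-path channel: equal gains, second path delayed by 1 microsecond.
paths = [(1.0, 0.0), (1.0, 1e-6)]
gain_dc = abs(transfer_function(paths, 0.0))
gain_null = abs(transfer_function(paths, 500e3))
gain_peak = abs(transfer_function(paths, 1e6))
print(gain_dc, gain_null, gain_peak)
```

The two paths add constructively at 0 Hz and 1 MHz but cancel at 500 kHz, so signal components
only 500 kHz apart experience very different channel gains.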
When the mobile speed is not zero, then $\beta_i(t)$ are time-varying. As a result, the channel
is no longer linear time-invariant. Instead, the channel is linear time-varying. Suppose the
channel input is a pure sinusoid, $x(t) = \exp (j\omega_p t)$. The output of this time-varying channel
according to Eq. (12.77) is

$$\sum_{i=0}^{K} \beta_i(t) \exp [j\omega_p (t - \tau_i)] = \exp (j\omega_p t) \cdot \sum_{i=0}^{K} \beta_i(t) \exp (-j\omega_p \tau_i) \qquad (12.80)$$

This relationship shows that the channel response to a sinusoidal input equals a sinusoid of
the same frequency but with time-varying amplitude. Moreover, the time-varying amplitude
of the channel output also depends on the input frequency ($\omega_p$). For these multipath channels,
the channel response is time-varying and is frequency dependent. In wireless communications,
time-varying channels are known as fading channels. When the time-varying behaviors
are dependent on frequency, the channels are known as frequency-selective fading channels.
Frequency-selective fading channels, which are characterized by time-varying ISI, are major
obstacles to wireless digital communications.

Flat Fading Channels 

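One special case to consider is when the multipath delays $\{\tau_i\}$ do not have a large spread. In
other words, let us assume

$$0 = \tau_0 < \tau_1 < \cdots < \tau_K$$

If the multipath delay spread is small, then $\tau_K \ll T$ and

$$\tau_i \approx 0 \qquad i = 1, 2, \ldots, K$$

In this special case, because $p(t - \tau_i) \approx p(t)$, the received signal y(t) is simply

$$\begin{aligned}
y(t) &= \sum_k s_k \left[ \sum_{i=0}^{K} \alpha_i \exp [-j(\omega_c + \omega_i)\tau_i] \exp (-j\omega_i t) \, p(t - kT - \tau_i) \right] \\
&\approx \sum_k s_k \left[ \sum_{i=0}^{K} \alpha_i \exp [-j(\omega_c + \omega_i)\tau_i] \exp (-j\omega_i t) \right] p(t - kT) \\
&= \left[ \sum_{i=0}^{K} \alpha_i \exp [-j(\omega_c + \omega_i)\tau_i] \exp (-j\omega_i t) \right] \sum_k s_k \, p(t - kT) \\
&= \rho(t) \cdot \sum_k s_k \, p(t - kT)
\end{aligned} \qquad (12.81)$$

where we have defined the time-varying channel gain as

$$\rho(t) = \sum_{i=0}^{K} \alpha_i \exp [-j(\omega_c + \omega_i)\tau_i] \exp (-j\omega_i t) \qquad (12.82)$$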

Therefore, when the multipath delay spread is small, the only distortion in the received
signal y(t) is a time-varying gain $\rho(t)$. This time variation of the received signal strength
is known as fading. Channels that exhibit only a time-varying gain that is dependent on the
environment are known as flat fading channels. Flat fading channels do not introduce any ISI
and therefore do not require equalization. Instead, since flat fading channels generate output
signals that have time-varying strength, periods of error-free detections tend to be followed
by periods of error bursts. To overcome burst errors due to flat fading channels, interleaving
forward error correction codewords is an effective tool.
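
The time-varying gain of Eq. (12.82) can be evaluated directly to see how deep a fade can
become. The Python sketch below is illustrative: it assumes a hypothetical channel with two
equal-strength paths, zero delay, and opposite Doppler shifts of 83 Hz, and the function name
fading_gain is not from the text.

```python
import cmath
import math


def fading_gain(paths, carrier_rad, t):
    """rho(t) = sum_i alpha_i exp[-j(w_c + w_i) tau_i] exp(-j w_i t).

    Each path is a tuple (alpha_i, w_i, tau_i), with w_i in rad/s and tau_i in seconds.
    """
    return sum(a * cmath.exp(-1j * (carrier_rad + w) * tau) * cmath.exp(-1j * w * t)
               for a, w, tau in paths)


w_c = 2 * math.pi * 900e6            # 900 MHz carrier
w_d = 2 * math.pi * 83.0             # 83 Hz Doppler shift
# Hypothetical paths: equal attenuation, zero delay, opposite Doppler shifts.
paths = [(1.0, +w_d, 0.0), (1.0, -w_d, 0.0)]

gain_start = abs(fading_gain(paths, w_c, 0.0))
gain_fade = abs(fading_gain(paths, w_c, 1.0 / (4 * 83.0)))
print(gain_start, gain_fade)
```

In this example the gain magnitude equals 2|cos(2 pi 83 t)|, so the received signal swings
from full strength to a complete null in about 3 ms, which produces the error bursts described
above.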


Converting Frequency-Selective Fading Channels 
into Flat Fading Channels 

Fast fading frequency-selective channels pose serious challenges to mobile wireless
communications. On one hand, the channels introduce ISI. On the other hand, the channel characteristics
are also time varying. Although the time domain equalization techniques described in Secs. 12.3
to 12.6 can effectively mitigate the effect of ISI, they require training data to either identify
the channel parameters or estimate equalizer parameters. Generally, parameter estimation of
channels or equalizers cannot work well unless the parameters stay nearly unchanged between
successive training periods. As a result, such time domain channel equalizers are not well
equipped to confront fast changing channels.

Fortunately, we do have an alternative. We have shown (in Sec. 12.7) that OFDM can convert
a frequency-selective channel into a parallel group of flat channels. When the underlying
channel is fast fading and frequency selective, OFDM can effectively convert it into a bank
of fast flat-fading channels. As a result, means to combat fast flat-fading channels such as code
interleaving can now be successfully applied to fast frequency-selective fading channels.
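
Code interleaving works by spreading a burst of channel errors over many codewords. The
Python sketch below illustrates the idea with a simple block interleaver, which is one common
choice and is assumed here only for illustration. The function names, the 4 x 7 block size, and
the burst position are hypothetical.

```python
def interleave(symbols, rows, cols):
    """Write rows (one codeword per row), then read out column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("block size mismatch")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave: restore the original row-by-row order."""
    if len(symbols) != rows * cols:
        raise ValueError("block size mismatch")
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]


rows, cols = 4, 7                       # 4 codewords of 7 symbols each
data = list(range(rows * cols))         # symbol labels 0..27
tx = interleave(data, rows, cols)

# A fade corrupts 4 consecutive transmitted symbols (positions 9 to 12).
hit = tx[9:13]
errors_per_codeword = [sum(1 for idx in hit if idx // cols == r) for r in range(rows)]
print(hit, errors_per_codeword)
```

A burst no longer than the number of rows places at most one error in each codeword, so a code
that corrects a single error per codeword can recover from the whole burst.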

We should note that for fast fading channels, another very effective means to combat the
fading effect is to introduce channel diversity. Channel diversity allows the same transmitted
data to be sent over a plurality of channels. Channel diversity can be achieved in the time
domain by repetition, in the frequency domain by using multiple bands, or in space by applying
multiple transmitting and receiving antennas. Because both time diversity and frequency
diversity occupy more bandwidth, spatial diversity in the form of multiple-input-multiple-output
(MIMO) systems has been particularly attractive recently. Among recent wireless standards,
Wi-Fi (IEEE 802.11n), WiMAX (IEEE 802.16e), and cellular LTE (long-term evolution) have
all adopted OFDM and MIMO technologies to achieve much higher data rate and better
coverage. We shall present some fundamental discussions on MIMO in Chapter 13.


12.12 MATLAB EXERCISES 

We provide three different computer exercises in this section; all model a QAM communication
system that modulates data using a 16-QAM constellation. The 16-QAM signals then pass
through linear channels with ISI and encounter additive white Gaussian noise (AWGN) at the
channel output.


COMPUTER EXERCISE 12.1: 16-QAM LINEAR EQUALIZATION

The first MATLAB program, Ex12_1.m, generates 1,000,000 points of 16-QAM data for transmission.
Each QAM symbol requires T as the symbol period. The transmitted pulse shape is a root-raised cosine with a
roll-off factor of 0.5 [Eq. (12.23)]. Thus the bandwidth at the baseband is (1 + 0.5)/(2T) = 0.75/T Hz.
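As a quick check of this bandwidth figure, the following minimal sketch evaluates (1 + r)/(2T) for r = 0.5 and, assuming the oversampling factor of 8 used in the program listing, expresses the result as a fraction of the Nyquist frequency of the oversampled signal. The variable names B, f_nyq, and Wn are illustrative and are not part of Ex12_1.m.

```matlab
% Illustrative check of the baseband bandwidth of a root-raised-cosine pulse.
% Variable names are hypothetical and not taken from Ex12_1.m.
r = 0.5;                    % roll-off factor
T = 1;                      % symbol period (normalized)
f_ovsamp = 8;               % samples per symbol, as assumed in the exercise

B = (1 + r)/(2*T);          % baseband bandwidth: 0.75/T Hz
f_nyq = f_ovsamp/(2*T);     % Nyquist frequency of the oversampled signal: 4/T
Wn = B/f_nyq;               % bandwidth normalized to Nyquist: (1+0.5)/8

fprintf('B = %.2f/T Hz, normalized to Nyquist = %.4f\n', B, Wn);
```

The normalized value (1 + 0.5)/8 is the same number that appears as the third argument of cheby2 in the program.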

% Matlab Program <Ex12_1.m>

% This Matlab exercise <Ex12_1.m> performs simulation of
% linear equalization under QAM-16 baseband transmission
% over a multipath channel with AWGN.
% Correct carrier and synchronization is assumed.

% Root-raised cosine pulse of rolloff factor = 0.5 is used
% Matched filter is applied at the receiver front end.

% The program estimates the symbol error rate (SER) at different Eb/N
clear;clf;

L=1000000;  % Total data symbols in experiment is 1 million

% To display the pulse shape, we oversample the signal
% by factor of f_ovsamp=8

f_ovsamp=8;  % Oversampling factor vs data rate

delay_rc=4;

% Generating root-raised cosine pulseshape (rolloff factor = 0.5)
prcos=rcosflt([1], 1, f_ovsamp, 'sqrt', 0.5, delay_rc);  % RRC pulse
prcos=prcos(1:end-f_ovsamp+1);  % remove 0's

prcos=prcos/norm(prcos);  % normalize

pcmatch=prcos(end:-1:1);  % MF

% Generating random signal data for polar signaling
s_data=4*round(rand(L,1))+2*round(rand(L,1))-3+...
    +j*(4*round(rand(L,1))+2*round(rand(L,1))-3);

% upsample to match the 'oversampling rate' (normalize by 1/T).
% It is f_ovsamp/T (T=1 is the symbol duration)
s_up=upsample(s_data,f_ovsamp);

% Identify the decision delays due to pulse shaping
% and matched filters
delayrc=2*delay_rc*f_ovsamp;

% Generate polar signaling of different pulse-shaping
xrcos=conv(s_up,prcos);

[c_num,c_den] = cheby2(12,20,(1+0.5)/8);

% The next commented line finds frequency response
%[H,fnlz]=freqz(c_num,c_den,512,8);

% The lowpass filter is the Tx filter before signal is sent to channel
xchout=filter(c_num,c_den,xrcos);

% We can now plot the power spectral densities of the two signals
% xrcos and xchout

% This shows the filtering effect of the Tx filter before
% transmission in terms of the signal power spectral densities
% It shows how little lowpass Tx filter may have distorted the signal
plotPSD_comparison

% Apply a 2-ray multipath channel

mpath=[1 0 0 -0.65];  % multipath delta(t)-0.65 delta(t-3T/8)

% time-domain multipath channel
h=conv(conv(prcos,pcmatch),mpath);
hscale=norm(h);

xchout=conv(mpath,xchout);  % apply 2-ray multipath

xrxout=conv(xchout,pcmatch);  % send the signal through matched filter
                              % separately from the noise

delaychb=delayrc+3;

out_mf=xrxout(delaychb+1:f_ovsamp:delaychb+L*f_ovsamp);
clear xrxout;

% Generate complex random noise for channel output
noiseq=randn(L*f_ovsamp,1)+j*randn(L*f_ovsamp,1);

% send AWGN noise into matched filter first
noiseflt=filter(pcmatch,[1],noiseq); clear noiseq;

% Generate sampled noise after matched filter before scaling it
% and adding to the QAM signal
noisesamp=noiseflt(1:f_ovsamp:L*f_ovsamp,1);

clear noiseq noiseflt;

Es=10*hscale;  % symbol energy

% Call linear equalizer receiver to work
linear_eq

for ii=1:10;
    Eb2Naz(ii)=2*ii-2;
    Q(ii)=3*0.5*erfc(sqrt((2*10^(Eb2Naz(ii)*0.1)/5)/2));
    %Compute the Analytical BER
end

% Now plot results
plotQAM_results

The transmission is over a two-ray multipath channel with impulse response

h(t) = g(t) - 0.65 g(t - 3T/8)

where g(t) is the response of a low-pass channel formed by applying a type II Chebyshev filter of
order 12, a stopband gap of 20 dB, and bandwidth of 0.75/T Hz. The impulse response of this channel
is shown in Fig. 12.19.
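To see why this two-ray channel is frequency selective, the following minimal sketch evaluates the frequency response of the multipath term alone, 1 - 0.65 exp(-j2πf · 3T/8), at a few frequencies across the signal band. The low-pass response g(t) is deliberately omitted, and the variable names are illustrative only.

```matlab
% Frequency response of the two-ray multipath term alone (g(t) omitted).
% Frequencies are in units of 1/T; all names here are illustrative.
a = -0.65;                          % relative gain of the delayed ray
tau = 3/8;                          % delay of the second ray in units of T

f = linspace(-0.75, 0.75, 7);       % sample points across the signal band
H = 1 + a*exp(-1j*2*pi*f*tau);      % 1 - 0.65 exp(-j 2 pi f tau)
gain_dB = 20*log10(abs(H));

disp([f.' gain_dB.']);              % column 1: f (1/T), column 2: gain (dB)
```

The gain is smallest at f = 0, where the two rays nearly cancel (|H| = 0.35), and grows toward the band edges, so different parts of the signal spectrum are attenuated by different amounts.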

The main program Ex12_1.m will call a subroutine program plotPSD_comparison.m to first
generate the power spectral densities of the transmitted signal before and after the low-pass Chebyshev
filter. The comparison in Fig. 12.20 shows that the root-raised-cosine design is almost ideally
band-limited, as the low-pass channel introduces very little change in the passband of the transmitted signal
spectrum. This means that the multipath environment is solely responsible for the ISI effect.

% MATLAB PROGRAM <plotPSD_comparison.m>

% This program computes the PSD of the QAM signal before and after it
% enters a good chebyshev lowpass filter prior to entering the channel
%

[Pdfy,fq]=pwelch(xchout,[],[],1024,8,'twosided');  % PSD after Tx filter
[Pdfx,fp]=pwelch(xrcos,[],[],1024,8,'twosided');   % PSD before Tx filter
figure(1);
subplot(211);semilogy(fp-f_ovsamp/2,fftshift(Pdfx),'b-');
axis([-4 4 1.e-10 1.2e0]);
xlabel('Frequency (in unit of 1/T_s)');ylabel('Power Spectrum');
title('(a) Lowpass filter input spectrum')
subplot(212);semilogy(fq-f_ovsamp/2,fftshift(Pdfy),'b-');
axis([-4 4 1.e-10 1.2e0]);
xlabel('Frequency (in unit of 1/T_s)');ylabel('Power Spectrum');
title('(b) Lowpass filter output spectrum')

After a matched filter has been applied at the receiver (root-raised cosine), the QAM signal will
be sampled, equalized, and decoded. The subroutine program linear_eq.m designs a T-spaced finite
length MMSE equalizer of order M = 8 as described in Sec. 12.3 [Eq. (12.43b)]. The equalizer is designed
by applying the first 200 QAM symbols as training data. The equalizer filters the matched filter output
before making a 16-QAM decision according to the decision region of Fig. 10.24b in Chapter 10.
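The training-based design can be illustrated on a much smaller problem. The sketch below is not the book's program: it assumes a hypothetical noise-free T-spaced channel hch = [1 0.5], fits a 9-tap (order 8) equalizer by least squares to the first 200 symbols, and then applies the same sign-based 16-QAM decision rule used in the listings. All variable names are illustrative.

```matlab
% Minimal sketch of least-squares equalizer training on a toy channel.
% Assumptions: hypothetical channel hch, no noise, equalization delay u = 0.
L = 2000;  Ntrain = 200;  Neq = 8;  u = 0;

% Random 16-QAM symbols with levels {-3,-1,1,3} on each axis
s = 4*round(rand(L,1)) + 2*round(rand(L,1)) - 3 + ...
    1j*(4*round(rand(L,1)) + 2*round(rand(L,1)) - 3);

hch = [1 0.5];                        % hypothetical T-spaced channel
z = filter(hch, 1, s);                % received samples (noise-free)

% Each row of Z holds the Neq+1 most recent received samples
Z = toeplitz(z(Neq+1:Ntrain), z(Neq+1:-1:1));
d = s(Neq+1-u:Ntrain-u);              % desired (training) symbols
f = pinv(Z'*Z)*Z'*d;                  % least-squares equalizer taps

y = filter(f, 1, z);                  % equalized samples

% Sign-based slicer: maps each axis to the nearest of {-3,-1,1,3}
slice = @(x) sign(x) + sign(x-2) + sign(x+2);
deq = slice(real(y)) + 1j*slice(imag(y));

nerr = sum(deq ~= s);
fprintf('Symbol errors after equalization: %d of %d\n', nerr, L);
```

In this noise-free setting the trained taps are close to the truncated inverse of the channel, and the residual ISI is far too small to cause decision errors.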


% MATLAB PROGRAM <linear_eq.m>

% This is the receiver part of the QAM equalization example
%

Ntrain=200;  % Number of training symbols for Equalization
Neq=8;       % Order of linear equalizer (=length-1)
u=0;         % equalization delay u must be <= Neq

SERneq=[];
SEReq=[];
for i=1:13,
    Eb2N(i)=i*2-1;                % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Es/(2*Eb2N_num);        % 1/SNR is the noise variance
    signois=sqrt(Var_n/2);        % standard deviation
    z1=out_mf+signois*noisesamp;  % Add noise

    Z=toeplitz(z1(Neq+1:Ntrain),z1(Neq+1:-1:1));  % signal matrix for
                                                  % computing R
    dvec=[s_data(Neq+1-u:Ntrain-u)];  % build training data vector
    f=pinv(Z'*Z)*Z'*dvec;             % equalizer tap vector
    dsig=filter(f,1,z1);              % apply FIR equalizer

    % Decision based on the Re/Im parts of the samples
    deq=sign(real(dsig(1:L)))+sign(real(dsig(1:L))-2)+...
        sign(real(dsig(1:L))+2)+...
        j*(sign(imag(dsig(1:L)))+sign(imag(dsig(1:L))-2)+...
        sign(imag(dsig(1:L))+2));

    % Now compare against the original data to compute SER
    % (1) for the case without equalizer
    dneq=sign(real(z1(1:L)))+sign(real(z1(1:L))-2)+...
        sign(real(z1(1:L))+2)+...
        j*(sign(imag(z1(1:L)))+sign(imag(z1(1:L))-2)+...
        sign(imag(z1(1:L))+2));
    SERneq=[SERneq;sum(abs(s_data~=dneq))/(L)];

    % (2) for the case with equalizer
    SEReq=[SEReq;sum(s_data~=deq)/L];
end


Once the linear equalization results are available, the main program Ex12_1.m calls another
subroutine program, plotQAM_results.m, to provide illustrative figures. In Fig. 12.21, the
noise-free eye diagram of the in-phase component at the output of the receiver matched filter before sampling
shows a strong ISI effect. The QAM signal eye is closed and, without equalization, a simple QAM
decision leads to very high probabilities of symbol error (also known as symbol error rate).

% MATLAB PROGRAM <plotQAM_results.m>

% This program plots symbol error rate comparison before and after
% equalization
%     constellation points
%     eye-diagrams before equalization
%

figure(2)
subplot(111)
figber=semilogy(Eb2Naz,Q,'k-',Eb2N,SERneq,'b-o',Eb2N,SEReq,'b-v');
axis([0 26 .99e-5 1]);
legend('Analytical','Without equalizer','With equalizer');
xlabel('E_b/N (dB)');ylabel('Symbol error probability');
set(figber,'Linewidth',2);

% Constellation plot before and after equalization
figure(3)
subplot(121)
plot(real(z1(1:min(L,4000))),imag(z1(1:min(L,4000))),'.');
axis('square')
xlabel('Real part')
title('(a) Before equalization')
ylabel('Imaginary part');
subplot(122)
plot(real(dsig(1:min(L,4000))),imag(dsig(1:min(L,4000))),'.');
axis('square')
title('(b) After equalization')
xlabel('Real part')
ylabel('Imaginary part');

figure(4)
t=length(h);
plot([1:t]/f_ovsamp,h);
xlabel('time (in unit of T_s)')
title('Multipath channel impulse response');

% Plot eye diagrams due to multipath channel
eyevec=conv(xchout,prcos);
eyevec=eyevec(delaychb+1:(delaychb+800)*f_ovsamp);
eyediagram(real(eyevec),16,2);
title('Eye diagram (in-phase component)');
xlabel('Time (in unit of T_s)');

Figure 12.21 Noise-free eye diagram of the in-phase (real) component at the receiver (after matched filter) before sampling: the eyes are closed, and ISI will lead to decision errors.

Figure 12.22 Scatter plots of signal samples before (a) and after (b) linear equalization at E_b/N = 26 dB demonstrate effective ISI mitigation by the linear equalizer.

Figure 12.23 Symbol error rate (SER) comparison before and after linear equalization demonstrates its effectiveness in combating multipath channel ISI.


We can suppress a significant amount of ISI by applying the linear equalizer to the sampled matched
filter output. Figure 12.22 compares the “scatter plot” of signal samples before and after equalization
at E_b/N = 26 dB. The contrast illustrates that the equalizer has effectively mitigated much of the ISI
introduced by the multipath channel.

The program linear_eq.m also statistically computes the symbol error rate (SER) at different
SNR levels. It further computes the ideal SER for an ISI-free AWGN channel (Chapter 10) and,
for comparison, the SER without equalization. The results shown in Fig. 12.23 clearly demonstrate the
effectiveness of linear equalization in this example.


COMPUTER EXERCISE 12.2: DECISION FEEDBACK EQUALIZATION

In this exercise, we use the main MATLAB program, Ex12_2.m, to generate the same kind of data as
in the last exercise. The main difference is that we adopt a slightly different two-ray multipath channel

h(t) = g(t) - 0.83 g(t - 3T/8)

in which the ISI is much more severe. At the receiver, instead of using linear equalizers, we will implement
and test the decision feedback equalizer (DFE) as described in Sec. 12.6. For simplicity, we will implement
only a DFE feedback filter, without using the feedforward (FFW) filter.

% Matlab Program <Ex12_2.m>

% This Matlab exercise <Ex12_2.m> performs simulation of
% decision feedback equalization under QAM-16 baseband transmission
% over a multipath channel with AWGN.

% Correct carrier and synchronization is assumed.

% Root-raised cosine pulse of rolloff factor = 0.5 is used
% Matched filter is applied at the receiver front end.

% The program estimates the symbol error rate (SER) at different Eb/N
clear;clf;

L=100000;  % Total data symbols in experiment is 100,000

% To display the pulse shape, we oversample the signal
% by factor of f_ovsamp=8

f_ovsamp=8;  % Oversampling factor vs data rate

delay_rc=4;

% Generating root-raised cosine pulseshape (rolloff factor = 0.5)
prcos=rcosflt([1], 1, f_ovsamp, 'sqrt', 0.5, delay_rc);  % RRC pulse
prcos=prcos(1:end-f_ovsamp+1);  % remove 0's

prcos=prcos/norm(prcos);  % normalize

pcmatch=prcos(end:-1:1);  % MF

% Generating random signal data for polar signaling
s_data=4*round(rand(L,1))+2*round(rand(L,1))-3+...
    +j*(4*round(rand(L,1))+2*round(rand(L,1))-3);

% upsample to match the 'oversampling rate' (normalize by 1/T).
% It is f_ovsamp/T (T=1 is the symbol duration)
s_up=upsample(s_data,f_ovsamp);

% Identify the decision delays due to pulse shaping
% and matched filters
delayrc=2*delay_rc*f_ovsamp;

% Generate polar signaling of different pulse-shaping
xrcos=conv(s_up,prcos);

[c_num,c_den] = cheby2(12,20,(1+0.5)/8);

% The next commented line finds frequency response
%[H,fnlz]=freqz(c_num,c_den,512,8);

% The lowpass filter is the Tx filter before signal is sent to channel
xchout=filter(c_num,c_den,xrcos);

% We can now plot the power spectral densities of the two signals
% xrcos and xchout

% This shows the filtering effect of the Tx filter before
% transmission in terms of the signal power spectral densities
% It shows how little lowpass Tx filter may have distorted the signal
plotPSD_comparison

% Apply a 2-ray multipath channel

mpath=[1 0 0 -0.83];  % multipath delta(t)-0.83 delta(t-3T/8)

% or use mpath=[1 0 0 .45];

% time-domain multipath channel
h=conv(conv(prcos,pcmatch),mpath);
hscale=norm(h);

xchout=conv(mpath,xchout);  % apply 2-ray multipath

xrxout=conv(xchout,pcmatch);  % send the signal through matched filter
                              % separately from the noise

delaychb=delayrc+3;

out_mf=xrxout(delaychb+1:f_ovsamp:delaychb+L*f_ovsamp);
clear xrxout;

% Generate complex random noise for channel output
noiseq=randn(L*f_ovsamp,1)+j*randn(L*f_ovsamp,1);

% send AWGN noise into matched filter first
noiseflt=filter(pcmatch,[1],noiseq); clear noiseq;

% Generate sampled noise after matched filter before scaling it
% and adding to the QAM signal
noisesamp=noiseflt(1:f_ovsamp:L*f_ovsamp,1);

clear noiseq noiseflt;

Es=10*hscale;  % symbol energy

% Call decision feedback equalizer receiver to work
dfe

SERdfe=SEReq;
for ii=1:9;
    Eb2Naz(ii)=2*ii;
    Q(ii)=3*0.5*erfc(sqrt((2*10^(Eb2Naz(ii)*0.1)/5)/2));
    %Compute the Analytical BER
end

% use the program plotQAM_results to show results
plotQAM_results

linear_eq

At the receiver, once the signal has passed through the root-raised-cosine matched filter, the
T-spaced samples will be sent into the DFE. The subroutine program dfe.m implements the DFE design
and the actual equalization. The DFE design requires the receiver to first estimate the discrete channel
response. We use the first 200 QAM symbols as training data for channel estimation. We then compute
the SER of the DFE output in dfe.m. The necessary program dfe.m is given here.
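The two steps described here, channel estimation from training symbols followed by feedback cancellation, can be sketched on a toy problem. The example below is not the book's program: it assumes a hypothetical noise-free T-spaced channel hch of order 2, estimates it by least squares from the first 200 symbols, and then runs a feedback-only DFE. All variable names are illustrative.

```matlab
% Minimal sketch of a feedback-only DFE on a toy channel.
% Assumptions: hypothetical channel hch, no noise, channel order Nch known.
L = 2000;  Ntrain = 200;  Nch = 2;

% Random 16-QAM symbols with levels {-3,-1,1,3} on each axis
s = 4*round(rand(L,1)) + 2*round(rand(L,1)) - 3 + ...
    1j*(4*round(rand(L,1)) + 2*round(rand(L,1)) - 3);

hch = [2 1.6 -0.6];                    % hypothetical channel, leading tap not 1
z = filter(hch, 1, s);                 % received samples (noise-free)

% Least-squares channel estimate from the training symbols
S = toeplitz(s(Nch+1:Ntrain), s(Nch+1:-1:1));
h_hat = pinv(S'*S)*S'*z(Nch+1:Ntrain);

z = z/h_hat(1);                        % normalize the leading tap to 1
b = h_hat(2:end)/h_hat(1);             % feedback (postcursor) taps

slice = @(x) sign(x) + sign(x-2) + sign(x+2);   % per-axis 4-level slicer
fb = zeros(1, Nch);                    % past decisions, most recent first
d = zeros(L, 1);
for k = 1:L
    v = z(k) - fb*b;                   % cancel ISI from past decisions
    d(k) = slice(real(v)) + 1j*slice(imag(v));
    fb = [d(k) fb(1:Nch-1)];           % shift the new decision in
end

nerr = sum(d ~= s);
fprintf('Symbol errors after DFE: %d of %d\n', nerr, L);
```

Without noise every decision is correct, so the feedback filter cancels the postcursor ISI exactly. With noise, a wrong decision would be fed back and could cause further errors, which is the error propagation effect noted in Sec. 12.6.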

% MATLAB PROGRAM <dfe.m>

% This is the receiver part of the QAM equalization
% that uses Decision feedback equalizer (DFE)
%

Ntrain=200;  % Number of training symbols for Equalization
Nch=3;       % Order of FIR channel (=length-1)

SEReq=[]; SERneq=[];
for i=1:13,
    Eb2N(i)=i*2-1;                % (Eb/N in dB)
    Eb2N_num=10^(Eb2N(i)/10);     % Eb/N in numeral
    Var_n=Es/(2*Eb2N_num);        % 1/SNR is the noise variance
    signois=sqrt(Var_n/2);        % standard deviation
    z1=out_mf+signois*noisesamp;  % Add noise

    Z=toeplitz(s_data(Nch+1:Ntrain),s_data(Nch+1:-1:1));
                                  % signal matrix for computing R
    dvec=[z1(Nch+1:Ntrain)];      % build training data vector
    h_hat=pinv(Z'*Z)*Z'*dvec;     % find channel estimate tap vector
    z1=z1/h_hat(1);               % equalize the gain loss
    h_hat=h_hat(2:end)/h_hat(1);  % set the leading tap to 1

    feedbk=zeros(1,Nch);
    for kj=1:L,
        zfk=feedbk*h_hat;         % feedback data
        dsig(kj)=z1(kj)-zfk;      % subtract the feedback
        % Now make decision after feedback
        d_temp=sign(real(dsig(kj)))+sign(real(dsig(kj))-2)+...
            sign(real(dsig(kj))+2)+...
            j*(sign(imag(dsig(kj)))+sign(imag(dsig(kj))-2)+...
            sign(imag(dsig(kj))+2));
        feedbk=[d_temp feedbk(1:Nch-1)];  % update the feedback data
    end

    % Now compute the entire DFE decision after decision feedback
    dfeq=sign(real(dsig))+sign(real(dsig)-2)+...
        sign(real(dsig)+2)+...
        j*(sign(imag(dsig))+sign(imag(dsig)-2)+...
        sign(imag(dsig)+2));
    dfeq=reshape(dfeq,L,1);

    % Compute the SER after decision feedback equalization
    SEReq=[SEReq;sum(s_data(1:L)~=dfeq)/L];

    % find the decision without DFE
    dneq=sign(real(z1(1:L)))+sign(real(z1(1:L))-2)+...
        sign(real(z1(1:L))+2)+...
        j*(sign(imag(z1(1:L)))+sign(imag(z1(1:L))-2)+...
        sign(imag(z1(1:L))+2));

    % Compute the SER without equalization
    SERneq=[SERneq;sum(abs(s_data~=dneq))/(L)];
end


Once the SER of the DFE has been determined, it is compared against the SER of the linear
equalization from the last exercise, along with the SER from ideal AWGN channel and the SER from
a receiver without equalization. We provide the results in Fig. 12.24. From the comparison, we can see
that both the DFE and the linear equalizer are effective at mitigating channel ISI. The linear equalizer
is slightly better at lower SNR because the DFE is more susceptible to error propagation (Sec. 12.6) at
lower SNR.

Figure 12.24 Symbol error rate (SER) comparison of DFE, linear equalization, and under ideal channel.


COMPUTER EXERCISE 12.3: OFDM TRANSMISSION OF QAM SIGNALS

In this example, we will utilize OFDM for QAM transmission. We choose the number of subcarriers (and
the FFT size) as N = 32. We let the finite impulse response (FIR) channel be

channel=[0.3 -0.5 0 1 .2 -0.3]

The channel length is 6 (L = 5 in Section 12.7). For this reason, we can select the cyclic prefix length to
be the minimum length of L = 5.
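The role of the cyclic prefix can be checked on a single OFDM frame. The following minimal sketch (illustrative variable names, no noise) sends one block of 32 random 16-QAM symbols through the channel given above with a prefix of length 5, and confirms that, after the prefix is discarded, each FFT output equals the transmitted symbol multiplied by the corresponding subcarrier gain.

```matlab
% Minimal single-frame OFDM sketch: cyclic prefix turns the FIR channel
% into N independent flat subchannels. Noise-free; names are illustrative.
N = 32;  Lcp = 5;
channel = [0.3 -0.5 0 1 .2 -0.3];      % FIR channel from the exercise
hf = fft(channel, N).';                % N x 1 vector of subcarrier gains

% One frame of random 16-QAM symbols
s = 4*round(rand(N,1)) + 2*round(rand(N,1)) - 3 + ...
    1j*(4*round(rand(N,1)) + 2*round(rand(N,1)) - 3);

x = ifft(s);                           % IFFT to the time domain
xcp = [x(end-Lcp+1:end); x];           % prepend the last Lcp samples
y = filter(channel, 1, xcp);           % linear convolution with the channel
Y = fft(y(Lcp+1:end));                 % drop the prefix, FFT back

err_gain = max(abs(Y - hf.*s));        % each output is gain times symbol
s_hat = Y./hf;                         % one-tap equalizer per subcarrier
fprintf('max |Y - H.*s| = %.2e\n', err_gain);
```

Because the prefix is at least as long as the channel memory, the retained samples see a circular convolution, which the FFT diagonalizes. A single complex division per subcarrier then recovers the data.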

% Matlab Program <Ex12_3.m>

% This Matlab exercise <Ex12_3.m> performs simulation of
% an OFDM system that employs QAM-16 baseband signaling
% over a multipath channel with AWGN.

% Correct carrier and synchronization is assumed.

% 32 subcarriers are used with channel length of 6
% and cyclic prefix length of 5.
clear;clf;

L=1600000;  % Total data symbols in experiment is 1.6 million

Lfr=L/32;   % number of data frames

% Generating random signal data for polar signaling
s_data=4*round(rand(L,1))+2*round(rand(L,1))-3+...
    +j*(4*round(rand(L,1))+2*round(rand(L,1))-3);

channel=[0.3 -0.5 0 1 .2 -0.3];  % channel in t-domain
hf=fft(channel,32);              % find the channel in f-domain
p_data=reshape(s_data,32,Lfr);   % S/P conversion
p_td=ifft(p_data);               % IFFT to convert to t-domain
p_cyc=[p_td(end-4:end,:);p_td];  % add cyclic prefix
s_cyc=reshape(p_cyc,37*Lfr,1);   % P/S conversion
Psig=10/32;                      % average channel input power
chsout=filter(channel,1,s_cyc);  % generate channel output signal
clear p_td p_cyc s_data s_cyc;   % release some memory
noiseq=(randn(37*Lfr,1)+j*randn(37*Lfr,1));
SEReq=[];

for ii=1:31,
    SNR(ii)=ii-1;                        % SNR in dB
    Asig=sqrt(Psig*10^(-SNR(ii)/10))*norm(channel);
    x_out=chsout+Asig*noiseq;            % Add noise
    x_para=reshape(x_out,37,Lfr);        % S/P conversion
    x_disc=x_para(6:37,:);               % discard tails
    xhat_para=fft(x_disc);               % FFT back to f-domain
    z_data=inv(diag(hf))*xhat_para;      % f-domain equalizing

    % compute the QAM decision after equalization
    deq=sign(real(z_data))+sign(real(z_data)-2)+sign(real(z_data)+2)+...
        j*(sign(imag(z_data))+sign(imag(z_data)-2)+sign(imag(z_data)+2));
    % Now compare against the original data to compute SER
    SEReq=[SEReq sum(p_data~=deq,2)/Lfr];
end

for ii=1:9,
    SNRa(ii)=2*ii-2;
    Q(ii)=3*0.5*erfc(sqrt((2*10^(SNRa(ii)*0.1)/5)/2));
    %Compute the Analytical BER
end

% call another program to display OFDM Analysis
ofdmAz


The main MATLAB program Ex12_3.m completes OFDM modulation, equalization, and detection.
Because the subcarriers (subchannels) have different gain and, consequently, different SNR, each
of the 32 subcarriers may have a different SER. Thus, simply comparing the overall SER does not tell the
full story. For this reason, we can call another program ofdmAz.m to analyze the results of this OFDM
system.

% MATLAB PROGRAM <ofdmAz.m>

% This program is used to analyze the OFDM subcarriers and their
% receiver outputs.

% Plot the subcarrier gains
figure(2);
stem(abs(hf));
xlabel('Subcarrier label');
title('Subchannel gain');

% Plot the subchannel constellation scattering after OFDM
figure(3);
subplot(221);plot(z_data(1,1:800),'.')    % subchannel 1 output
ylabel('Imaginary');
title('(a) Subchannel 1 output');axis('square');
subplot(222);plot(z_data(10,1:800),'.');  % subchannel 10 output
ylabel('Imaginary');
title('(b) Subchannel 10 output');axis('square');
subplot(223);plot(z_data(15,1:800),'.');  % subchannel 15 output
xlabel('Real');ylabel('Imaginary');
title('(c) Subchannel 15 output');axis('square');
subplot(224);plot(z_data(:,1:800),'b.');  % mixed subchannel output
xlabel('Real');ylabel('Imaginary');
title('(d) Mixed OFDM output');axis('square');

% Plot the average OFDM SER versus SER under "ideal channel"
% By Disabling 5 poor subcarriers, average SER can be reduced.
figure(4);
figc=semilogy(SNRa,Q,'k-',SNR,mean(SEReq),'b-o',...
    SNR,mean([SEReq(1:14,:);SEReq(20:32,:)]),'b-s');
set(figc,'LineWidth',2);
legend('Ideal channel','Using all subcarriers','Disabling 5 poor subcarriers')
title('Average OFDM SER');
axis([1 30 1.e-4 1]);hold off;
xlabel('SNR (dB)');ylabel('Symbol Error Rate (SER)');

Figure 12.25 Comparison of the channel gain for 32 subcarriers.

First, we display the subchannel gain H[n] in Fig. 12.25. We can clearly see that, among the 32
subchannels, the 5 near the center have the lowest gains and hence the lowest SNR. We therefore expect
them to exhibit the worst performance. By fixing the average channel SNR at 30 dB, we can take a quick
peek at the equalizer outputs of the different subcarrier equalizers. In particular, we select subchannels 1,
10, and 15 because they represent the moderate, good, and poor channels, respectively. Scatter plots of the
output samples (Fig. 12.26a-c) clearly demonstrate the quality contrast among them. If we do not make
any distinction among subchannels, we can see from Fig. 12.26d that the overall OFDM performance is
dominated mainly by the poor subchannels.
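The weakest subcarriers can also be identified numerically from the channel taps alone. The short sketch below (illustrative variable names) sorts the 32 subchannel gains and lists the five smallest. For this channel they should be subcarriers 15 through 19 in MATLAB's 1-based indexing, which are exactly the rows that ofdmAz.m leaves out when it averages SEReq(1:14,:) and SEReq(20:32,:).

```matlab
% Locate the five weakest OFDM subcarriers from the channel taps.
% Names are illustrative; indices are 1-based as in the program listings.
channel = [0.3 -0.5 0 1 .2 -0.3];
hf = fft(channel, 32);                 % 32 subchannel gains

[~, idx] = sort(abs(hf));              % ascending order of gain
weakest = sort(idx(1:5));              % indices of the five smallest gains

disp(weakest);
disp(abs(hf(weakest)));
```

Since the noise power is the same on every subcarrier, the ranking by gain is also the ranking by subchannel SNR.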

We can also look at the SER of all 32 individual subcarriers in Fig. 12.27. We see very clearly that
the 5 worst channels are responsible for the 5 worst SER performances. Naturally, if we average the SER
across all 32 subchannels, the larger SERs tend to dominate and make the overall SER of the OFDM
system much higher.

Figure 12.28 Average SER of the OFDM subcarriers before and after disabling five worst channels.

To make the OFDM system more reliable, one possible approach is to apply bit loading. In fact, one
extreme case of bit loading is to disable all the poor subchannels (i.e., to send nothing on the subchannels
with very low gains). We can see from the SER comparison of Fig. 12.28 that by disabling 5 of the worst
channels among the 32 subcarriers, the overall SER is significantly reduced (improved).


PROBLEMS

12.1-1 In a QAM transmission of symbol rate 1/T = 1 MHz, assume that p(t) is a raised-cosine pulse

with roll-off factor of (1th The carrier frequency in use is 2.4 GHz. 

(a) Derive the resulting baseband pulse part q(t) when the multipath channel impulse response
is given by

$0.95\delta(t) - 0.3\delta(t - T/2)$

(b) Show whether the eye is open for QPSK transmission in part (a) when the channel outputs
are sampled at t = kT.

12.2-1 Consider the signal transmission model of Prob. 12.1-1.

(a) Determine the matched filter for the equivalent baseband pulse resulting from the multipath
channel.

(b) Determine the equivalent discrete time linear system transfer function H(z) between the
QAM input symbols and the matched filter output sampled at t = kT.

12.2- 2 In a digital QAM system, the received baseband pulse shape is q(i) = A (^). The channel 

noise (before the matched filter) is AWGN with spectrum of $\mathcal{N}/2$.

(a) Find the power spectral density of the noise w(t) at the matched filter output.

(b) Determine the mean and the variance of the sampled noise w[kT] at the matched filter
output.

(c) Show whether the noise samples w[kT] are independent.

12.3-1 In a BPSK baseband system, the discrete time channel is specified by

$H(z) = 1 + 0.6z^{-1}$

The received signal samples are

$z[k] = H(z)s_k + w[k]$

The BPSK signal is $s_k = \pm 1$ with equal probability. The discrete channel noise w[k] is additive
white Gaussian with zero mean and variance $\mathcal{N}/2$ such that $E_b/\mathcal{N} = 18$.

(a) Find the probability of error if z[k] is directly sent into a BPSK decision device.

(b) Find the probability of error if z[k] first passes through a zero-forcing equalizer before a
BPSK decision device.

12.3-2 Repeat Prob. 12.3-1 if the discrete channel is

$H(z) = 1 + 0.9z^{-1}$

12.3-3 Compare the BER results of Probs. 12.3-1 and 12.3-2. Observe the different depth of the channel
spectral nulls and explain their BER difference based on the different noise amplification
effect.





12.3-4 For the channel of Prob. 12.3-1, find the response of a six-tap MMSE equalizer. Determine the
resulting minimum MSE. What is the corresponding MSE if the ZF equalizer is applied instead?

12.3-5 Repeat Prob. 12.3-3 for the FIR channel of Prob. 12.3-2.

12.4-1 In a fractionally sampled channel, the sampling frequency is chosen to be 2/T (i.e., there are
two samples for every transmitted symbol $s_k$). The two sampled subchannel responses are

$H_1(z) = 1 + 0.9z^{-1} \qquad H_2(z) = -0.3 + 0.5z^{-1}$

Both subchannels have additive white Gaussian noises that are independent with zero mean and
identical variance of 0.2. The input symbol is a PAM-4 with equal probability of being
(±1, ±3).

(a) Show that $F_1(z) = 0.3$ and $F_2(z) = 1$ form a zero-forcing equalizer.

(b) Show that $F_1(z) = 1$ and $F_2(z) = -1.8$ also form a zero-forcing equalizer.

(c) Show which of the two previous fractionally spaced ZF equalizers delivers better
performance. This result shows that ZF equalizers of different delays can lead to different
performance.

12.4-2 For the same system of Prob. 12.4-1, complete the following.

(a) Find the ZF equalizers of delays 0, 1, and 2, respectively, when the ZF equalizer filters have
order 1, that is,

$F_i(z) = f_i[0] + f_i[1]z^{-1}, \quad i = 1, 2$

(b) Find the resulting noise distribution at the equalizer output for each of the three fractionally
spaced ZF equalizers.

(c) Determine the probability of symbol error if hard PAM decision is taken from the equalizer
output.

12.6-1 In a DFE for binary polar signaling, $s_k = \pm 1$ with equal probability. The feedforward filter
output d[k] is given by

$d[k] = s_{k-2} + 0.8s_{k-3} + w[k]$

where w[k] is white Gaussian with zero mean and variance 0.04.

(a) Determine the DFE filter coefficient.

(b) Find the DFE output BER when the decisions in feedback are error free.

(c) If the decision device is not error free, then there will be error propagation. Find the
probability of error of the next decision on symbol $s_{k-2}$ when the previous decision is
known to be wrong.

12.7-1 Prove that $W_N \cdot W_N^{-1} = I_{N \times N}$.

12.7-2 A cyclic matrix is a matrix that is completely specified by its first row (or column). Row i is
a circular shift of the elements in row i - 1. In other words, if the first row of matrix C is
$[a_1, \ldots, a_{N-1}, a_N]$, then the second row is $[a_N, a_1, \ldots]$, and so on. Prove that any cyclic
matrix of size N x N can be diagonalized by $W_N$ and $W_N^{-1}$, that is,

$W_N \cdot C \cdot W_N^{-1}$ = diagonal

12.7-3 Consider an FIR channel with impulse response

$h[0] = 1.0, \quad h[1] = -0.5, \quad h[2] = 0.3$

The channel noise is additive white Gaussian with spectrum $\mathcal{N}/2$. Design an OFDM system
with N = 16 by (a) specifying the length of the cyclic prefix; (b) determining the N subchannel
gains; (c) deriving the bit error rate of each subchannel for BPSK modulation; (d) finding the
average bit error rate of the entire OFDM system.

12.7-4 Consider an FIR channel of order up to L. First, we apply the usual IDFT on the source data
vector via



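Next, instead of applying a cyclic prefix as in Eq. (12.64b), we insert a string of L zeros in front
of every N data before transmission as in

$$\begin{bmatrix} s_N \\ s_{N-1} \\ \vdots \\ s_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}_{(N+L) \times 1}$$

This zero-padded data vector is transmitted normally over the FIR channel $\{h[k]\}$. At the receiver
end, we stack up the received symbols $\{z[n]\}$ into

$$\begin{bmatrix} z[N] \\ z[N-1] \\ \vdots \\ z[L] \\ \vdots \\ z[1] \end{bmatrix} + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ z[N+L] \\ \vdots \\ z[N+1] \end{bmatrix}$$

Prove that

$$\cdots = \begin{bmatrix} H[N] & & & \\ & H[N-1] & & \\ & & \ddots & \\ & & & H[1] \end{bmatrix} \cdots$$

This illustrates the equivalence between zero padding and cyclic prefix in OFDM.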

12.7-5 Show that in AWGN channels, cyclic OFDM and zero-padded OFDM achieve identical SNRs
for the same channel input power.


Among all the means of communication discussed thus far, none produces error-free
communication. We may be able to improve the accuracy in digital signals by reducing
the error probability $P_e$. But it appears that as long as channel noise exists, our
communications cannot be free of errors. For example, in all the digital systems discussed thus
far, $P_e$ varies as $e^{-kE_b}$ asymptotically. By increasing $E_b$, the energy per bit, we can reduce $P_e$
to any desired level. Now, the signal power is $S_i = E_b R_b$, where $R_b$ is the bit rate. Hence,
increasing $E_b$ means either increasing the signal power $S_i$ (for a given bit rate), decreasing the
bit rate $R_b$ (for a given power), or both. Because of physical limitations, however, $S_i$ cannot
be increased beyond a certain limit. Hence, to reduce $P_e$ further, we must reduce the rate
of transmission of information digits. Thus, the price to be paid for reducing $P_e$ is a reduction
in the transmission rate. To make $P_e$ approach 0, $R_b$ also approaches 0. Hence, it appears that
in the presence of channel noise it is impossible to achieve error-free communication. Thus
thought communication engineers until the publication of Shannon's seminal paper¹ in 1948.
Shannon showed that for a given channel, as long as the rate of information digits per second to
be transmitted is maintained within a certain limit determined by the physical channel (known
as the channel capacity), it is possible to achieve error-free communication. That is, to attain
$P_e \to 0$, it is not necessary to make $R_b \to 0$. Such a goal ($P_e \to 0$) can be attained by
maintaining $R_b$ below C, the channel capacity (per second). The gist of Shannon's paper is
that the presence of random disturbance in a channel does not, by itself, define any limit on
transmission accuracy. Instead, it defines a limit on the information rate for which an arbitrarily
small error probability ($P_e \to 0$) can be achieved.

We have been using the phrase “rate of information transmission” as if information could 
be measured. This is indeed so. We shall now discuss the information content of a message as 
understood by our “common sense” and also as it is understood in the “engineering sense.” 
Surprisingly, both approaches yield the same measure of information in a message. 

13.1 MEASURE OF INFORMATION 

Commonsense Measure of Information 

Consider the following three hypothetical headlines in a morning paper: 

1. There will be daylighttomorrow. 

2. United States invades Iran. 


3. Iran invades the United States.

The reader will hardly notice the first headline unless he or she lives near the North or the
South Pole. The reader will be very, very interested in the second. But what really catches
the reader's attention is the third headline. This item will attract much more interest than the
other two headlines. From the viewpoint of "common sense," the first headline conveys hardly
any information; the second conveys a large amount of information; and the third conveys yet
a larger amount of information. If we look at the probabilities of occurrence of these three
events, we find that the probability of occurrence of the first event is unity (a certain event),
that of the second is low (an event of small but finite probability), and that of the third is
practically zero (an almost impossible event). If an event of low probability occurs, it causes
greater surprise and, hence, conveys more information than the occurrence of an event of larger
probability. Thus, the information is connected with the element of surprise, which is a result
of uncertainty, or unexpectedness. The more unexpected the event, the greater the surprise,
and hence the more information. The probability of occurrence of an event is a measure of its
unexpectedness and, hence, is related to the information content. Thus, from the point of view
of common sense, the amount of information received from a message is directly related to the
uncertainty or inversely related to the probability of its occurrence. If $P$ is the probability of
occurrence of a message and $I$ is the information gained from the message, it is evident from
the preceding discussion that when $P \to 1$, $I \to 0$ and when $P \to 0$, $I \to \infty$, and, in general,
a smaller $P$ gives a larger $I$. This suggests the following information measure:

$$I \sim \log \frac{1}{P} \tag{13.1}$$

Engineering Measure of Information 

We now show that from an engineering point of view, the information content of a message
is consistent with the intuitive measure [Eq. (13.1)]. What do we mean by an engineering
point of view? An engineer is responsible for the efficient transmission of messages. For this
service the engineer will charge a customer an amount proportional to the information to be
transmitted. But in reality the engineer will charge the customer in proportion to the time that
the message occupies the channel bandwidth for transmission. In short, from an engineering
point of view, the amount of information in a message is proportional to the (minimum) time
required to transmit the message. We shall now show that this concept of information also
leads to Eq. (13.1). This implies that a message with higher probability can be transmitted in a
shorter time than that required for a message with lower probability. This fact may be verified
by the example of the transmission of alphabetic symbols in the English language using Morse
code. This code is made up of various combinations of two symbols (such as a dash and a dot
in Morse code, or pulses of height $A$ and $-A$ volts). Each letter is represented by a certain
combination of these symbols, called the codeword, which has a certain length. Obviously,
for efficient transmission, shorter codewords are assigned to the letters e, t, a, and o, which
occur more frequently. The longer codewords are assigned to letters x, k, q, and z, which occur
less frequently. Each letter may be considered to be a message. It is obvious that the letters that
occur more frequently (with higher probability of occurrence) need a shorter time to transmit
(shorter codewords) than those with smaller probability of occurrence. We shall now show
that on the average, the time required to transmit a symbol (or a message) with probability of
occurrence $P$ is indeed proportional to $\log (1/P)$.

For the sake of simplicity, let us begin with the case of binary messages $m_1$ and $m_2$, which
are equally likely to occur. We may use binary digits to encode these messages, representing
$m_1$ and $m_2$ by the digits 0 and 1, respectively. Clearly, we must have a minimum of one binary
digit (which can assume two values) to represent each of the two equally likely messages. Next,
consider the case of the four equiprobable messages $m_1$, $m_2$, $m_3$, and $m_4$. If these messages are
encoded in binary form, we need a minimum of two binary digits per message. Each binary
digit can assume two values. Hence, a combination of two binary digits can form the four
codewords 00, 01, 10, 11, which can be assigned to the four equiprobable messages $m_1$, $m_2$,
$m_3$, and $m_4$, respectively. It is clear that each of these four messages takes twice as much
transmission time as that required by each of the two equiprobable messages and, hence,
contains twice as much information. Similarly, we can encode any one of eight equiprobable
messages with a minimum of three binary digits. This is because three binary digits form eight
distinct codewords, which can be assigned to each of the eight messages. It can be seen that,
in general, we need $\log_2 n$ binary digits to encode each of $n$ equiprobable messages.* Because
all the messages are equiprobable, $P$, the probability of any one message occurring, is $1/n$.
Hence, to encode each message (with probability $P$), we need $\log_2 (1/P)$ binary digits. Thus,
from the engineering viewpoint, the information $I$ contained in a message with probability of
occurrence $P$ is proportional to $\log_2 (1/P)$,


$$I = k \log_2 \frac{1}{P} \tag{13.2}$$

where $k$ is a constant to be determined. Once again, we come to the conclusion (from the
engineering viewpoint) that the information content of a message is proportional to the logarithm
of the reciprocal of the probability of the message.

We shall now define the information conveyed by a message according to Eq. (13.2). The
proportionality constant is taken as unity for convenience, and the information is then in terms
of binary units, abbreviated bit (binary unit):

$$I = \log_2 \frac{1}{P} \quad \text{bits} \tag{13.3}$$

According to this definition, the information $I$ in a message can be interpreted as the
minimum number of binary digits required to encode the message. This is given by $\log_2 (1/P)$,
where $P$ is the probability of occurrence of the message. Although here we have shown this
result for the special case of equiprobable messages, we shall show in the next section that it
is true for nonequiprobable messages also.
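As a small numerical check of Eq. (13.3), the following sketch evaluates the information of a message from its probability. The function name `self_information` and the optional `base` argument are illustrative choices, not notation from the text.

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Information I = log_base(1/p) of a message with probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    # log_base(x) is computed as log2(x) / log2(base).
    return math.log2(1.0 / p) / math.log2(base)

# n equiprobable messages each have P = 1/n, so each carries log2(n) bits.
for n in (2, 4, 8):
    print(n, "equiprobable messages:", self_information(1.0 / n), "bits each")
```

With `base=2` the result is in bits; a certain message (`p = 1`) carries no information.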

Next, we shall consider the case of r-ary digits instead of binary digits for encoding. Each
of the r-ary digits can assume $r$ values $(0, 1, 2, \ldots, r - 1)$. Each of $n$ messages (encoded by
r-ary digits) can then be transmitted by a particular sequence of r-ary signals. Because each
r-ary digit can assume $r$ values, $k$ r-ary digits can form a maximum of $r^k$ distinct codewords.
Hence, to encode each of the $n$ equiprobable messages, we need a minimum of $k = \log_r n$ r-ary
digits.† But $n = 1/P$, where $P$ is the probability of occurrence of each message. Obviously,
we need a minimum of $\log_r (1/P)$ r-ary digits. The information $I$ per message is therefore


$$I = \log_r \frac{1}{P} \quad r\text{-ary units} \tag{13.4}$$


From Eqs. (13.3) and (13.4) it is evident that 


$$I = \log_2 \frac{1}{P} \ \text{bits} = \log_r \frac{1}{P} \ r\text{-ary units}$$


* Here we are assuming that the number $n$ is such that $\log_2 n$ is an integer. Later on we shall observe that this
restriction is not necessary.

† Here again we are assuming that $n$ is such that $\log_r n$ is an integer. As we shall see later, this restriction is not
necessary.

Hence,*

$$1 \ r\text{-ary unit} = \log_2 r \ \text{bits} \tag{13.5}$$

A Note on the Unit of Information: Although it is tempting to use the r-ary unit as
a general unit of information, the binary unit bit ($r = 2$) is commonly used in the literature.
There is, of course, no loss of generality in using $r = 2$. These units can always be converted
into any other units by using Eq. (13.5). Henceforth, unless otherwise stated, we shall use the
binary unit (bit) for information. The bases of the logarithmic functions will be omitted, but
will be understood to be 2.
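The conversion in Eq. (13.5) is easy to check numerically. This is a minimal sketch; the helper names `bits_per_unit` and `convert_units` are assumptions made for illustration.

```python
import math

def bits_per_unit(r: float) -> float:
    """Number of bits in one r-ary unit, per Eq. (13.5): log2(r)."""
    return math.log2(r)

def convert_units(amount: float, from_base: float, to_base: float) -> float:
    """Convert information measured in from_base-ary units into to_base-ary units."""
    return amount * math.log(from_base) / math.log(to_base)

print(bits_per_unit(10))          # bits in one 10-ary unit
print(bits_per_unit(math.e))      # bits in one base-e unit
print(convert_units(3.0, 2, 8))   # 3 bits expressed in 8-ary units
```

The second function generalizes Eq. (13.5) to any pair of bases by the change-of-base rule for logarithms.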


Average Information per Message: Entropy of a Source 

Consider a memoryless source $m$ emitting messages $m_1, m_2, \ldots, m_n$ with probabilities
$P_1, P_2, \ldots, P_n$, respectively ($P_1 + P_2 + \cdots + P_n = 1$). A memoryless source implies that each
message emitted is independent of the previous message(s). By the definition in Eq. (13.3) [or
Eq. (13.4)], the information content of message $m_i$ is $I_i$, given by


$$I_i = \log \frac{1}{P_i} \quad \text{bits} \tag{13.6}$$


The probability of occurrence of $m_i$ is $P_i$. Hence, the mean, or average, information per message
emitted by the source is given by $\sum_{i=1}^{n} P_i I_i$ bits. The average information per message of a
source $m$ is called its entropy, denoted by $H(m)$. Hence,

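$$H(m) = \sum_{i=1}^{n} P_i I_i \quad \text{bits}$$

$$= \sum_{i=1}^{n} P_i \log \frac{1}{P_i} \quad \text{bits} \tag{13.7a}$$

$$= -\sum_{i=1}^{n} P_i \log P_i \quad \text{bits} \tag{13.7b}$$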


The entropy of a source is a function of the message probabilities. It is interesting to find the
message probability distribution that yields the maximum entropy. Because the entropy is a
measure of uncertainty, the probability distribution that generates the maximum uncertainty
will have the maximum entropy. On qualitative grounds, one expects entropy to be maximum
when all the messages are equiprobable. We shall now show that this is indeed true.


* In general,

$$1 \ r\text{-ary unit} = \log_s r \ s\text{-ary units}$$

The 10-ary unit of information is called the hartley in honor of R. V. L. Hartley,² who was one of the pioneers
(along with Nyquist³ and Carson) in the area of information transmission in the 1920s. The rigorous mathematical
foundations of information theory, however, were established by C. E. Shannon¹ in 1948.

1 hartley = $\log_2 10$ = 3.32 bits

Sometimes the unit nat is used:

1 nat = $\log_2 e$ = 1.44 bits

Because $H(m)$ is a function of $P_1, P_2, \ldots, P_n$, the maximum value of $H(m)$ is found from
the equation $dH(m)/dP_i = 0$ for $i = 1, 2, \ldots, n$, with the constraint that

$$1 = P_1 + P_2 + \cdots + P_{n-1} + P_n \tag{13.8}$$


Because the function for maximization is 

$$H(m) = -\sum_{i=1}^{n} P_i \log P_i \tag{13.9}$$

we need to use the Lagrangian to form a new function

$$f(P_1, P_2, \ldots, P_n) = -\sum_{i=1}^{n} P_i \log P_i + \lambda (P_1 + P_2 + \cdots + P_{n-1} + P_n - 1)$$

Hence,

$$\frac{\partial f}{\partial P_j} = -\log P_j + \lambda - \log e \qquad j = 1, 2, \ldots, n$$

Setting the derivatives to zero leads to

$$P_1 = P_2 = \cdots = P_n = \frac{2^{\lambda}}{e}$$

By invoking the probability constraint of Eq. (13.8), we have

$$n \, \frac{2^{\lambda}}{e} = 1$$

Thus,

$$P_1 = P_2 = \cdots = P_n = \frac{1}{n} \tag{13.10}$$

To show that Eq. (13.10) yields $[H(m)]_{\max}$ and not $[H(m)]_{\min}$, we note that when $P_1 = 1$ and
$P_2 = P_3 = \cdots = P_n = 0$, $H(m) = 0$, whereas the probabilities in Eq. (13.10) yield

$$H(m) = -\sum_{i=1}^{n} \frac{1}{n} \log \frac{1}{n} = \log n$$
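The maximum-entropy result can be checked numerically. The sketch below computes the entropy of Eq. (13.7a) for three distributions over four messages; the function name `entropy` and the sample distributions are illustrative assumptions, not values from the text.

```python
import math

def entropy(probs, base=2.0):
    """H = sum of P_i * log(1/P_i); terms with P_i = 0 contribute nothing."""
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0) / math.log2(base)

n = 4
uniform = [1.0 / n] * n             # equiprobable messages, Eq. (13.10)
skewed = [0.7, 0.1, 0.1, 0.1]       # an arbitrary non-uniform example
certain = [1.0, 0.0, 0.0, 0.0]      # one message occurs with certainty

print(entropy(uniform), math.log2(n))   # equiprobable case equals log n
print(entropy(skewed))                  # smaller than log n
print(entropy(certain))                 # no uncertainty, zero entropy
```

Trying other distributions over the same four messages should never give a value above `log2(4) = 2` bits.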

The Intuitive (Commonsense) and the Engineering Interpretations of Entropy: Earlier
we observed that both the intuitive and the engineering viewpoints lead to the same
definition of the information associated with a message. The conceptual bases, however, are
entirely different for the two points of view. Consequently, we have two physical interpretations
of information. According to the engineering point of view, the information content of
any message is equal to the minimum number of digits required to encode the message, and,
therefore, the entropy $H(m)$ is equal to the minimum number of digits per message required,
on the average, for encoding. From the intuitive standpoint, on the other hand, information is
thought of as being synonymous with the amount of surprise, or uncertainty, associated with
the event (or message). A smaller probability of occurrence implies more uncertainty about the
event. Uncertainty is, of course, associated with surprise. Hence intuitively, the information
associated with a message is a measure of the uncertainty (unexpectedness) of the message.
Therefore, $\log (1/P_i)$ is a measure of the uncertainty of the message $m_i$, and $\sum_{i=1}^{n} P_i \log (1/P_i)$
is the average uncertainty (per message) of the source that generates messages $m_1, m_2, \ldots, m_n$
with probabilities $P_1, P_2, \ldots, P_n$. Both these interpretations prove useful in the qualitative
understanding of the mathematical definitions and results in information theory. Entropy may
also be viewed as a function associated with a random variable $m$ that assumes values $m_1$,
$m_2, \ldots, m_n$ with probabilities $P(m_1), P(m_2), \ldots, P(m_n)$:

$$H(m) = \sum_{i=1}^{n} P(m_i) \log \frac{1}{P(m_i)} = \sum_{i=1}^{n} P_i \log \frac{1}{P_i}$$

Thus, we can associate an entropy with every discrete random variable. 

If the source is not memoryless (i.e., in the event that a message emitted at any time is
dependent on the previous messages emitted), then the source entropy will be less than $H(m)$
in Eq. (13.9). This is because the dependence of a message on previous messages reduces its
uncertainty.


13.2 SOURCE ENCODING 

The minimum number of binary digits required to encode a message was shown to be equal to
the source entropy $\log (1/P)$ if all the messages of the source are equiprobable (each message
probability is $P$). We shall now generalize this result to the case of nonequiprobable messages.
We shall show that the average number of binary digits per message required for encoding is
given by $H(m)$ (in bits) for an arbitrary probability distribution of the messages.

Let a source $m$ emit messages $m_1, m_2, \ldots, m_n$ with probabilities $P_1, P_2, \ldots, P_n$, respectively.
Consider a sequence of $N$ messages with $N \to \infty$. Let $k_i$ be the number of times
message $m_i$ occurs in this sequence. Then according to the relative frequency interpretation
(or law of large numbers),

$$\lim_{N \to \infty} \frac{k_i}{N} = P_i$$

Thus, the message $m_i$ occurs $NP_i$ times in a sequence of $N$ messages (provided $N \to \infty$).
Therefore, in a typical sequence of $N$ messages, $m_1$ will occur $NP_1$ times, $m_2$ will occur
$NP_2$ times, ..., $m_n$ will occur $NP_n$ times. All other compositions are extremely unlikely to
occur ($P \to 0$). Thus, any typical sequence (where $N \to \infty$) has the same proportion of the
$n$ messages, although in general the order will be different. We shall assume a memoryless
source; that is, we assume that the message is emitted from the source independently of the
previous messages. Consider now a typical sequence $S_N$ of $N$ messages from the source.
Because the $n$ messages (of probability $P_1, P_2, \ldots, P_n$) occur $NP_1, NP_2, \ldots, NP_n$ times, and
because each message is independent, the probability of occurrence of a typical sequence $S_N$
is given by


$$P(S_N) = (P_1)^{NP_1} (P_2)^{NP_2} \cdots (P_n)^{NP_n} \tag{13.11}$$

Because all typical sequences of $N$ messages from this source have the same composition,
all these sequences (of $N$ messages) are equiprobable, with probability $P(S_N)$. We can consider
these long sequences as new messages (which are now equiprobable). To encode one such
sequence we need $L_N$ binary digits, where

$$L_N = \log \left[ \frac{1}{P(S_N)} \right] \quad \text{binary digits} \tag{13.12}$$

Substituting Eq. (13.11) into Eq. (13.12), we obtain

$$L_N = N \sum_{i=1}^{n} P_i \log \frac{1}{P_i} = N H(m) \quad \text{binary digits}$$

Note that $L_N$ is the length (number of binary digits) of the codeword required to encode $N$
messages in sequence. Hence, $L$, the average number of digits required per message, is $L_N/N$
and is given by

$$L = \frac{L_N}{N} = H(m) \quad \text{binary digits} \tag{13.13}$$

Thus, by encoding $N$ successive messages, it is possible to encode a sequence of source
messages using, on the average, $H(m)$ binary digits per message, where $H(m)$ is the entropy
of the source message (in bits). Moreover, one can show that $H(m)$ is indeed, on the average,
the minimum number of digits required to encode this message source. It is impossible to find
any uniquely decodable code whose average length is less than $H(m)$.⁴,⁵
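The typical-sequence argument can be illustrated by simulation. The sketch below draws a long sequence from a memoryless source and compares $(1/N)\log[1/P(S_N)]$ with $H(m)$. The four-message source, the seed, and the function names are assumptions chosen for illustration; they do not come from the text.

```python
import math
import random

def entropy(probs):
    """H(m) in bits for a list of message probabilities."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def digits_per_message(sequence, probs):
    """(1/N) * log2(1 / P(sequence)) for a memoryless source."""
    log_prob = sum(math.log2(probs[m]) for m in sequence)
    return -log_prob / len(sequence)

probs = [0.5, 0.25, 0.125, 0.125]    # illustrative source, not from the text
rng = random.Random(0)               # fixed seed so the run is repeatable
N = 100_000
sequence = rng.choices(range(len(probs)), weights=probs, k=N)

print(entropy(probs))                        # H(m)
print(digits_per_message(sequence, probs))   # close to H(m) for large N

# A sequence with exactly the typical composition N*P_i meets H(m) exactly.
typical = [0] * 4 + [1] * 2 + [2] + [3]
print(digits_per_message(typical, probs))
```

The longer the sequence, the closer the simulated value is expected to come to $H(m)$, in line with the law-of-large-numbers argument above.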


Huffman Code 

The source encoding theorem says that to encode a source with entropy $H(m)$, we need, on the
average, a minimum of $H(m)$ binary digits per message. The number of digits in the codeword
is the length of the codeword. Thus, the average word length of an optimum code is $H(m)$.
Unfortunately, to attain this length, in general, we have to encode a sequence of $N$ messages
($N \to \infty$) at a time. If we wish to encode each message directly without using longer sequences,
then, in general, the average length of the codeword per message will be greater than $H(m)$.
In practice, it is not desirable to use long sequences, since they cause transmission delay and
add to equipment complexity. Hence, it is preferable to encode messages directly, even if the
price has to be paid in terms of increased word length. In most cases, the price turns out to be
small. The following is a procedure, given without proof, for finding the optimum source code,
called the Huffman code. The proof that this code is optimum can be found elsewhere.⁴⁻⁶

We shall illustrate the procedure with an example using a binary code. We first arrange
the messages in the order of descending probability, as shown in Table 13.1. Here we have
six messages with probabilities 0.30, 0.25, 0.15, 0.12, 0.08, and 0.10, respectively. We now
aggregate the last two messages into one message with probability $P_5 + P_6 = 0.18$. This
leaves five messages with probabilities 0.30, 0.25, 0.18, 0.15, and 0.12. These messages are
now rearranged in the second column in the order of descending probability. We repeat this
procedure by aggregating the last two messages in the second column and rearranging them
in the order of descending probability. This is done until the number of messages is reduced
to two. These two (reduced) messages are now assigned 0 and 1 as their first digits in the code
sequence. We now go back and assign the numbers 0 and 1 to the second digit for the two
messages that were aggregated in the previous step. We keep regressing in this way until the
first column is reached. The code finally obtained (for the first column) can be shown to be
optimum. The complete procedure is shown in Tables 13.1 and 13.2.

TABLE 13.1

Original Source                 Reduced Sources
Messages    Probabilities       S1         S2
m1          0.30                0.30       0.30
m2          0.25                0.25       →0.27
m3          0.15                →0.18      0.25
m4          0.12                0.15       0.18
m5          0.08                0.12
m6          0.10

TABLE 13.2

Original Source                      Reduced Sources
Messages  Probabilities  Code        S1            S2            S3            S4
m1        0.30           00          0.30   00     0.30   00     →0.43  1      →0.57  0
m2        0.25           10          0.25   10     →0.27  01     0.30   00     0.43   1
m3        0.15           010         →0.18  11     0.25   10     0.27   01
m4        0.12           011         0.15   010    0.18   11
m5        0.08           110         0.12   011
m6        0.10           111
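The aggregate-and-regress procedure described above can be sketched with a priority queue. This is one possible implementation under stated assumptions, not the text's own algorithm listing: the function name `huffman_code` is hypothetical, and the choice of which branch receives 0 or 1 is arbitrary, so individual codewords may differ from those in Table 13.2 while the codeword lengths agree.

```python
import heapq
import itertools

def huffman_code(probs):
    """Return {message: codeword} for a binary Huffman code.

    probs maps each message name to its probability.
    """
    counter = itertools.count()   # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {m: ""}) for m, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Aggregate the two least probable (possibly reduced) messages.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Regress: prepend the distinguishing digit to every member of each group.
        merged = {m: "0" + c for m, c in group0.items()}
        merged.update({m: "1" + c for m, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]

# Source probabilities as listed in Table 13.1.
source = {"m1": 0.30, "m2": 0.25, "m3": 0.15, "m4": 0.12, "m5": 0.08, "m6": 0.10}
code = huffman_code(source)
average_length = sum(source[m] * len(code[m]) for m in source)

for m in source:
    print(m, source[m], code[m])
print("average length:", average_length)
```

Building the codewords while merging avoids a separate backward pass; each merge plays the role of one reduction column in the tables.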

The optimum (Huffman) code obtained this way is also called a compact code. The
average length of the compact code in the present case is given by

$$L = \sum_{i=1}^{n} P_i L_i = 0.3(2) + 0.25(2) + 0.15(3) + 0.12(3) + 0.1(3) + 0.08(3) = 2.45 \ \text{binary digits}$$

The entropy $H(m)$ of the source is given by

$$H(m) = \sum_{i=1}^{n} P_i \log_2 \frac{1}{P_i} = 2.422 \ \text{bits}$$

Hence, the minimum possible length (attained by an infinitely long sequence of messages)
is 2.422 binary digits. By using direct coding (the Huffman code), it is possible to attain an
average length of 2.45 binary digits in the example given. This is a close approximation of the
optimum performance attainable. Thus, little is gained by complex coding of a number of
messages in this case.

The merit of any code is measured by its average length in comparison to $H(m)$ (the
average minimum length). We define the code efficiency $\eta$ as

$$\eta = \frac{H(m)}{L}$$

where $L$ is the average length of the code. In our present example,

$$\eta = \frac{2.422}{2.45} = 0.989$$

The redundancy $\gamma$ is defined as

$$\gamma = 1 - \eta = 0.011$$
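The entropy, average length, efficiency, and redundancy of this example can be recomputed directly from the six probabilities and the codeword lengths of Table 13.2. The variable names below are illustrative.

```python
import math

probs = [0.30, 0.25, 0.15, 0.12, 0.10, 0.08]
lengths = [2, 2, 3, 3, 3, 3]            # codeword lengths read from Table 13.2

entropy = sum(p * math.log2(1.0 / p) for p in probs)          # H(m), bits
average_length = sum(p * n for p, n in zip(probs, lengths))   # L, binary digits
efficiency = entropy / average_length                         # eta = H(m) / L
redundancy = 1.0 - efficiency                                 # gamma = 1 - eta

print(f"H(m) = {entropy:.3f} bits, L = {average_length:.2f} binary digits")
print(f"efficiency = {efficiency:.3f}, redundancy = {redundancy:.3f}")
```

Because all four of the less probable messages have three-digit codewords, the order in which 0.10 and 0.08 are listed does not affect the result.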

Even though the Huffman code is a variable length code, it is uniquely decodable. If
we receive a sequence of Huffman-coded messages, it can be decoded only one way, that is,
without ambiguity. For instance, if the source in this exercise were to emit the message sequence
$m_1 m_5 m_2 m_1 m_4 m_3 m_6 \ldots$, it would be encoded as 001101000011010111.... The reader may
verify that this message sequence can be decoded only one way, viz., $m_1 m_5 m_2 m_1 m_4 m_3 m_6 \ldots$,
even if there is no demarcation between individual messages. This uniqueness is assured by
the special property that no codeword is a prefix of another (longer) codeword.
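The prefix property is what lets a receiver decode digit by digit without any separators. A minimal decoder sketch follows, using the codewords of Table 13.2; the function name `decode_prefix_code` is a hypothetical helper.

```python
def decode_prefix_code(bits, code):
    """Decode a digit string produced by a prefix-free code, left to right."""
    inverse = {word: message for message, word in code.items()}
    messages, current = [], ""
    for digit in bits:
        current += digit
        if current in inverse:        # a complete codeword has been read
            messages.append(inverse[current])
            current = ""
    if current:
        raise ValueError("input ends in the middle of a codeword")
    return messages

# Codewords read from Table 13.2.
table_13_2 = {"m1": "00", "m2": "10", "m3": "010",
              "m4": "011", "m5": "110", "m6": "111"}

decoded = decode_prefix_code("001101000011010111", table_13_2)
print(decoded)
```

Since no codeword is a prefix of another, the first match found while reading left to right is the only possible one, so no backtracking is needed.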

A similar procedure is used to find a compact r-ary code. In this case we arrange the
messages in descending order of probability, combine the last $r$ messages into one message,
and rearrange the new set (reduced set) in the order of descending probability. We repeat the
procedure until the final set reduces to $r$ messages. Each of these messages is now assigned
one of the $r$ numbers $0, 1, 2, \ldots, r - 1$. We now regress in exactly the same way as in the
binary case until each of the original messages has been assigned a code.

For an r-ary code, we will have exactly $r$ messages left in the last reduced set if, and
only if, the total number of original messages is $r + k(r - 1)$, where $k$ is an integer. This
is because each reduction decreases the number of messages by $r - 1$. Hence, if there is a
total of $k$ reductions, the total number of original messages must be $r + k(r - 1)$. In case the
original messages do not satisfy this condition, we must add some dummy messages with zero
probability of occurrence until this condition is fulfilled. For example, if $r = 4$ and the number
of messages $n$ is 6, then we must add one dummy message with zero probability of occurrence
to make the total number of messages 7, that is, $[4 + 1(4 - 1)]$, and proceed as usual. The
procedure is illustrated in Example 13.1.


Example 13.1 A memoryless source emits six messages with probabilities 0.3, 0.25, 0.15, 0.12, 0.1, and 0.08.
Find the 4-ary (quaternary) Huffman code. Determine its average word length, the efficiency,
and the redundancy.

In this case, we need to add one dummy message to satisfy the required condition of
$r + k(r - 1)$ messages and proceed as usual. The Huffman code is found in Table 13.3.
The length $L$ of this code is

$$L = 0.3(1) + 0.25(1) + 0.15(1) + 0.12(2) + 0.1(2) + 0.08(2) + 0(2) = 1.3 \ \text{4-ary digits}$$

TABLE 13.3

Original Source                          Reduced Source
Messages   Probabilities   Code
m1         0.30            0             0.30    0
m2         0.25            2             →0.30   1
m3         0.15            3             0.25    2
m4         0.12            10            0.15    3
m5         0.10            11
m6         0.08            12
m7         0.00            13

Also,

$$H_4(m) = -\sum_{i=1}^{6} P_i \log_4 P_i = 1.211 \ \text{4-ary units}$$

The code efficiency $\eta$ is given by

$$\eta = \frac{1.211}{1.3} = 0.93$$

The redundancy $\gamma = 1 - \eta = 0.07$.
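The r-ary procedure, including the padding with zero-probability dummy messages, can be sketched as follows. The function name `huffman_lengths` is hypothetical, and the function returns only codeword lengths, which is all that the average length and efficiency require.

```python
import heapq
import itertools
import math

def huffman_lengths(probs, r=2):
    """Codeword lengths of an r-ary Huffman code for the given probabilities."""
    n = len(probs)
    # Pad with dummies until the message count has the form r + k(r - 1).
    dummies = (r - n) % (r - 1)
    counter = itertools.count()   # tie-breaker so the heap never compares lists
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    heap += [(0.0, next(counter), []) for _ in range(dummies)]
    heapq.heapify(heap)
    lengths = [0] * n
    while len(heap) > 1:
        total, members = 0.0, []
        for _ in range(r):        # combine the r least probable entries
            p, _tie, group = heapq.heappop(heap)
            total += p
            members += group
        for i in members:         # every combined message gains one digit
            lengths[i] += 1
        heapq.heappush(heap, (total, next(counter), members))
    return lengths

probs = [0.30, 0.25, 0.15, 0.12, 0.10, 0.08]     # source of Example 13.1
lengths = huffman_lengths(probs, r=4)
average_length = sum(p * n for p, n in zip(probs, lengths))
entropy_4ary = sum(p * math.log(1.0 / p, 4) for p in probs)

print(lengths, average_length, entropy_4ary / average_length)
```

Calling `huffman_lengths(probs, r=2)` on the same source gives the binary codeword lengths, since no dummy messages are ever needed when $r = 2$.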


To achieve code efficiency $\eta \to 1$, we need $N \to \infty$. The Huffman code uses $N = 1$, but
its efficiency is, in general, less than 1. A compromise exists between these two extremes of
$N = 1$ and $N = \infty$. We can encode a group of $N = 2$ or 3 messages. In most cases, the use
of $N = 2$ or 3 can yield an efficiency close to 1, as the following example shows.


Example 13.2 A memoryless source emits messages $m_1$ and $m_2$ with probabilities 0.8 and 0.2, respectively.
Find the optimum (Huffman) binary code for this source as well as for its second- and third-order
extensions (i.e., for $N = 2$ and 3). Determine the code efficiencies in each case.

The Huffman code for the source is simply 0 and 1, giving $L = 1$, and

$$H(m) = -(0.8 \log 0.8 + 0.2 \log 0.2) = 0.72 \ \text{bit}$$

Hence,

$$\eta = 0.72$$
For the second-order extension of the source ($N = 2$), there are four possible composite
messages, $m_1 m_1$, $m_1 m_2$, $m_2 m_1$, and $m_2 m_2$, with probabilities 0.64, 0.16, 0.16, and 0.04,
respectively. The Huffman code is obtained in Table 13.4.

TABLE 13.4

Original Source                        Reduced Sources
Messages   Probabilities   Code
m1m1       0.64            0           0.64    0       0.64    0
m1m2       0.16            11          →0.20   10      →0.36   1
m2m1       0.16            100         0.16    11
m2m2       0.04            101

TABLE 13.5

Messages   Probabilities   Code
m1m1m1     0.512           0
m1m1m2     0.128           100
m1m2m1     0.128           101
m2m1m1     0.128           110
m1m2m2     0.032           11100
m2m1m2     0.032           11101
m2m2m1     0.032           11110
m2m2m2     0.008           11111

In this case the average word length $L'$ is

$$L' = 0.64(1) + 0.16(2) + 0.16(3) + 0.04(3) = 1.56$$

This is the word length for two messages of the original source. Hence $L$, the word length
per message, is

$$L = \frac{L'}{2} = 0.78$$

and

$$\eta = \frac{0.72}{0.78} = 0.923$$

If we proceed with $N = 3$ (the third-order extension of the source), we have eight possible
messages, and following the Huffman procedure, we find the code as shown in Table 13.5.
The word length $L''$ is

$$L'' = (0.512)1 + (0.128 + 0.128 + 0.128)3 + (0.032 + 0.032 + 0.032)5 + (0.008)5 = 2.184$$

Then,

$$L = \frac{L''}{3} = 0.728$$

and

$$\eta = \frac{0.72}{0.728} = 0.989$$
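The effect of coding $N$ messages at a time can be reproduced with a short script. The sketch below relies on the fact that the average length of a binary Huffman code equals the sum of the probabilities of all merged nodes; the function name `huffman_average_length` is hypothetical. Note that the script uses the unrounded entropy (about 0.7219 bit), so its efficiencies differ slightly in the third decimal place from values computed with $H(m)$ rounded to 0.72.

```python
import heapq
import itertools
import math

def huffman_average_length(probs):
    """Average codeword length of a binary Huffman code.

    Each merge adds one digit to every message beneath it, so the average
    length equals the sum of the probabilities of all merged nodes.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

base = [0.8, 0.2]                                   # source of Example 13.2
entropy = sum(p * math.log2(1.0 / p) for p in base)

per_message_length = {}
for N in (1, 2, 3):
    # Probabilities of all composite messages of the Nth-order extension.
    extension = [math.prod(combo) for combo in itertools.product(base, repeat=N)]
    per_message_length[N] = huffman_average_length(extension) / N
    print(N, per_message_length[N], entropy / per_message_length[N])
```

Raising the range of `N` further shows the per-message length approaching $H(m)$ from above, at the cost of a codebook that doubles in size with each step.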


13.3 ERROR-FREE COMMUNICATION OVER A NOISY CHANNEL

As seen in the previous section, messages of a source with entropy $H(m)$ can be encoded by
using an average of $H(m)$ digits per message. This encoding has zero redundancy. Hence,
if we transmit these coded messages over a noisy channel, some of the information will be
received erroneously. There is absolutely no possibility of error-free communication over a
noisy channel when messages are encoded with zero redundancy. The use of redundancy, in
general, helps combat noise. This can be seen from a simple example of a single parity check
code, in which an extra binary digit is added to each codeword to ensure that the total number
of 1s in the resulting codeword is always even (or odd). If a single error occurs in the received
codeword, the parity is violated, and the receiver requests retransmission. This is a rather
simple example to demonstrate the utility of redundancy. More complex coding procedures,
which can correct up to $n$ digits, will be discussed in the next chapter.
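The single parity check code described above takes only a few lines. The helper names below (`add_even_parity`, `parity_ok`, `flip`) are assumptions made for this sketch, which uses even parity.

```python
def add_even_parity(word: str) -> str:
    """Append one check digit so that the total number of 1s is even."""
    return word + str(word.count("1") % 2)

def parity_ok(received: str) -> bool:
    """True when the received word still has an even number of 1s."""
    return received.count("1") % 2 == 0

def flip(word: str, position: int) -> str:
    """Simulate a channel error by inverting one digit."""
    flipped = "1" if word[position] == "0" else "0"
    return word[:position] + flipped + word[position + 1:]

sent = add_even_parity("1011")
print(sent, parity_ok(sent))                # parity holds for the transmitted word
print(parity_ok(flip(sent, 2)))             # a single error violates parity
print(parity_ok(flip(flip(sent, 0), 3)))    # two errors pass the check unnoticed
```

The last line shows the limitation of a single check digit: any even number of errors leaves the parity intact, so the code detects but cannot correct, and only odd numbers of errors are caught.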

The addition of an extra digit increases the average word length to $H(m) + 1$, giving
$\eta = H(m)/[H(m) + 1]$, and the redundancy is $1 - \eta = 1/[H(m) + 1]$. Thus, the addition of
an extra check digit increases redundancy, but it also helps combat noise. Immunity against
channel noise can be increased by increasing the redundancy. Shannon has shown that it is
possible to achieve error-free communication by adding sufficient redundancy.


Transmission over Binary Symmetric Channels 

Consider a binary symmetric channel (BSC) with an error probability $P_e$. For error-free
communication over this channel, messages from a source with entropy $H(m)$ must be
encoded by binary codes with a word length of at least $H(m)/C_s$, where

$$C_s = 1 - \left[ P_e \log \frac{1}{P_e} + (1 - P_e) \log \frac{1}{1 - P_e} \right] \tag{13.14}$$

The parameter $C_s$ ($C_s < 1$) is called the channel capacity (to be discussed next in Sec. 13.4).

Because of the intentional addition of redundancy for error protection, the efficiency of
these codes is always below $C_s < 1$. If a certain binary channel has $C_s = 0.4$, a code that
can achieve error-free communication must have at least $2.5 H(m)$ binary digits per message,
which is 2.5 times as many digits as are required for coding without redundancy. This means
there are $1.5 H(m)$ redundant digits per message. Thus, on the average, for every 2.5 digits
transmitted, one digit is the information digit and 1.5 digits are redundant, or check, digits,
giving a redundancy of $1 - C_s = 0.6$.
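Equation (13.14) and the word-length bound $H(m)/C_s$ can be evaluated numerically. This is a minimal sketch; the function names `bsc_capacity` and `min_digits_per_message` and the sample error probabilities are illustrative assumptions.

```python
import math

def bsc_capacity(pe: float) -> float:
    """C_s of Eq. (13.14) in bits per binary digit, for 0 <= pe <= 1."""
    def term(p):
        # p * log2(1/p), with the limiting value 0 at p = 0.
        return 0.0 if p == 0.0 else p * math.log2(1.0 / p)
    return 1.0 - (term(pe) + term(1.0 - pe))

def min_digits_per_message(entropy_bits: float, capacity: float) -> float:
    """Lower bound H(m)/C_s on the average word length for error-free transmission."""
    return entropy_bits / capacity

for pe in (0.0, 0.01, 0.1, 0.5):
    print(pe, bsc_capacity(pe))

# The case quoted in the text: C_s = 0.4 needs 2.5 digits per bit of entropy,
# which corresponds to a redundancy of 1 - C_s = 0.6.
print(min_digits_per_message(1.0, 0.4), 1.0 - 0.4)
```

The capacity falls to zero at $P_e = 0.5$, where the received digit is statistically independent of the transmitted one.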

As discussed in the beginning of this chapter, $P_e$, the error probability of binary signaling,
varies as $e^{-kE_b}$ and, hence, to make $P_e \to 0$, either $S_i \to \infty$ or $R_b \to 0$. Because $S_i$ must be
finite, $P_e \to 0$ only if $R_b \to 0$. But Shannon's results state that it is really not necessary to
let $R_b \to 0$ for error-free communication over bandwidth $B$. All that is required is to hold $R_b$
below $C$, the channel capacity per second ($C = 2BC_s$). Where is the discrepancy? To answer
this question let us investigate carefully the role of redundancy in error-free communication.
Although the discussion here is with reference to a binary scheme, it is quite general and can
be extended to the M-ary case.

Figure 13.1 Three-dimensional cube in Hamming space.

Consider a simple method of reducing $P_e$ by repeating a given digit an odd number of
times. For example, we can transmit 0 and 1 as 000 and 111. The receiver uses the majority
rule to make the decision; that is, if at least two out of three digits are 1, the decision is 1, and
if at least two out of three digits are 0, the decision is 0. Thus, if fewer than two of the three
digits are in error, the information is received error-free. Similarly, to correct two errors, we
need five repetitions. In any case, repetitions cause redundancy but improve $P_e$ (Example 8.8).
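The repetition scheme with majority decision can be sketched as follows. The function names `repeat_encode` and `majority_decode` and the sample received sequences are assumptions made for illustration.

```python
def repeat_encode(bits: str, n: int = 3) -> str:
    """Transmit every information digit n times (n odd)."""
    return "".join(b * n for b in bits)

def majority_decode(received: str, n: int = 3) -> str:
    """Decide each block of n received digits by the majority rule."""
    blocks = (received[i:i + n] for i in range(0, len(received), n))
    return "".join("1" if block.count("1") > n // 2 else "0" for block in blocks)

sent = repeat_encode("101")
print(sent)

# One channel error in the first block and one in the third block.
print(majority_decode("110000101"))
# Two errors in the first block lead to a wrong decision for that digit.
print(majority_decode("100000111"))
```

Passing `n=5` to both functions gives the five-repetition scheme, which tolerates two errors per block.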

It will be instructive to understand the situation just described from a graphic point of
view. Consider the case of three repetitions. We can show all eight possible sequences of
three binary digits graphically as the vertices of a cube (Fig. 13.1). It is convenient to map
binary sequences as shown in Fig. 13.1 and to talk in terms of what is called the Hamming
distance between binary sequences. If two binary sequences of the same length differ in $j$ places
($j$ digits), then the Hamming distance between the sequences is considered to be $j$. Thus, the
Hamming distance between 000 and 010 (or 001 and 101) is 1, and is 3 between 000 and 111.
In the case of three repetitions, we transmit binary 1 by 111 and binary 0 by 000. The Hamming
distance between these two sequences is 3. Observe that of the eight possible vertices, we are
occupying only two (000 and 111) for transmitted messages. At the receiver, however, because
of channel noise, we are liable to receive any one of the eight sequences. The majority decision
rule can be interpreted as a rule that decides in favor of the message (000 or 111) that is at
the closest Hamming distance to the received sequence. Sequences 000, 001, 010, and 100 are
within 1 unit of the Hamming distance from 000 but are at least 2 units away from 111. Hence,
when we receive any one of these four sequences, our decision is binary 0. Similarly, when
any one of the sequences 110, 111, 011, or 101 is received, the decision is binary 1.
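The interpretation of the majority rule as a nearest-codeword decision can be verified by enumerating all eight vertices. The function names `hamming_distance` and `nearest_codeword` are illustrative.

```python
from itertools import product

def hamming_distance(a: str, b: str) -> int:
    """Number of places in which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def nearest_codeword(received: str, codewords):
    """Pick the codeword at the smallest Hamming distance from the received sequence."""
    return min(codewords, key=lambda c: hamming_distance(received, c))

codewords = ["000", "111"]
decisions = {}
for vertex in ("".join(v) for v in product("01", repeat=3)):
    decisions[vertex] = nearest_codeword(vertex, codewords)
    print(vertex, "->", decisions[vertex])
```

With an odd number of repetitions a received sequence can never be equidistant from the two codewords, so the minimum is always unique.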

We can now see why the error probability is reduced in this scheme. Of the eight possible
vertices, we have used only two, which are separated by 3 Hamming units. If we draw a
Hamming sphere of unit radius around each of these two vertices (000 and 111), the two
Hamming spheres* will be nonoverlapping. The channel noise can cause a distance between
the received sequence and the transmitted sequence, and as long as this distance is equal to
or less than 1 unit, we can still detect the message without error. In a similar way, the case
of five repetitions can be represented by a hypercube of five dimensions. The transmitted
sequences 00000 and 11111 occupy two vertices separated by five units, and the Hamming
spheres of 2-unit radius drawn around each of these two vertices would be nonoverlapping.
In this case, even if channel noise causes two errors, we can still detect the message correctly.
Hence, the reason for the reduction in error probability is that we have not used all the available
vertices for messages. Had we occupied all the available vertices for messages (as is the case
without redundancy, or repetition), then if channel noise caused even one error, the received
sequence would occupy a vertex assigned to another transmitted sequence, and we would
inevitably make a wrong decision. Precisely because we have left the neighboring vertices
of the transmitted sequence unoccupied, we are able to detect the sequence correctly, despite
channel errors within a certain limit. The smaller the fraction of vertices used, the smaller the
error probability. It should also be remembered that redundancy (or repetition) is what makes
it possible to have unoccupied vertices.

* Note that the Hamming sphere is not a true geometrical hypersphere because the Hamming distance is not a true
geometrical distance (e.g., sequences 001, 010, and 100 lie on a Hamming sphere centered at 111 and having a
radius 2).
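
The geometric argument above can be checked numerically. The following is a minimal Python sketch, assuming independent digit errors with probability p_e, an odd number of repetitions n, and majority-vote decoding; the function name is illustrative and does not come from the text.

```python
from math import comb


def repetition_error_probability(n: int, p_e: float) -> float:
    """Probability that majority-vote decoding of an n-fold repetition fails.

    Assumes n is odd and that digit errors are independent with probability p_e.
    The decision is wrong when more than (n - 1) / 2 of the n digits are in error,
    that is, when the received sequence leaves the Hamming sphere of radius (n - 1) / 2.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that a majority always exists")
    t = (n - 1) // 2  # largest number of digit errors that is still decoded correctly
    return sum(comb(n, j) * p_e**j * (1 - p_e) ** (n - j) for j in range(t + 1, n + 1))


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, repetition_error_probability(n, 0.01))
```

Under these assumptions the error probability falls quickly with n but stays strictly positive, while the information rate falls by the factor n.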

Repetition Is Inefficient 

If we continue to increase $n$, the number of repetitions, we will reduce $P_e$, but we will also
reduce $R_b$ by the factor $n$. But no matter how large we make $n$, the error probability never
becomes zero. The trouble with this scheme is that it is inefficient because we are adding
redundant (or check) digits to each information digit. To give an analogy, redundant (or check)
digits are like guards protecting the information digit. To hire guards for each information digit
is somewhat similar to a case of families living on a certain street that has been hit by several
burglaries. Each family panics and hires a guard. This is obviously expensive and inefficient.
A better solution would be for all the families on the street to hire one guard and share the
expense. One guard can check on all the houses on the street, assuming a reasonably short
street. If the street is too long, it might be necessary to hire a team of guards. But it is certainly
not necessary to hire one guard per house. In using repetitions, we had a similar situation.
Redundant (or repeated) digits were used to help (or check on) only one message digit. Using
the clue from the preceding analogy, it might be more efficient if we used redundant digits
not to check (guard) any one individual transmitted digit but, rather, a block of digits. Herein
lies the key to our problem. Let us consider a group of information digits over a certain time
interval of $T$ seconds, and let us add some redundant digits to check on all these digits.

Suppose we need to transmit $\alpha$ binary information digits per second. Then over a period
of $T$ seconds, we have a block of $\alpha T$ binary information digits. If to this block of information
digits we add $(\beta - \alpha)T$ check digits (i.e., $\beta - \alpha$ check digits, or redundant digits, per second),
then we need to transmit $\beta T$ ($\beta > \alpha$) digits for every $\alpha T$ information digits. Therefore over a
$T$-second interval, we have

$$
\begin{aligned}
\alpha T &= \text{information digits} \\
\beta T &= \text{total transmitted digits} \quad (\beta > \alpha) \\
(\beta - \alpha)T &= \text{check digits}
\end{aligned}
\tag{13.15}
$$

Thus, instead of transmitting one binary digit every $1/\alpha$ second, we let $\alpha T$ digits accumulate
over $T$ seconds. Now consider this as a message to be transmitted. There are a total of $2^{\alpha T}$
such supermessages. Thus, every $T$ seconds we need to transmit one of the $2^{\alpha T}$ possible
supermessages. These supermessages are transmitted by a sequence of $\beta T$ binary digits. There
are in all $2^{\beta T}$ possible sequences of $\beta T$ binary digits, and they can be represented as vertices of
a $\beta T$-dimensional hypercube. Because we have only $2^{\alpha T}$ messages to be transmitted, whereas
$2^{\beta T}$ vertices are available, we occupy only a $2^{-(\beta - \alpha)T}$ fraction of the vertices of the
$\beta T$-dimensional hypercube. Observe that we have reduced the transmission rate by a factor of
$\alpha/\beta$. This rate reduction factor $\alpha/\beta$ is independent of $T$. The fraction of the vertices occupied
(occupancy factor) by transmitted messages is $2^{-(\beta - \alpha)T}$ and can be made as small as possible
simply by increasing $T$. In the limit as $T \to \infty$, the occupancy factor approaches 0. This will
make the error probability go to 0, and we have the possibility of error-free communication.
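
To see how quickly the occupancy factor shrinks while the rate reduction factor stays fixed, the short Python sketch below evaluates $2^{-(\beta - \alpha)T}$ for a few block durations. The numerical values of alpha, beta, and T are arbitrary illustrations, not values from the text.

```python
def occupancy_factor(alpha: float, beta: float, T: float) -> float:
    """Fraction of the 2**(beta*T) hypercube vertices used by 2**(alpha*T) supermessages."""
    if not beta > alpha > 0:
        raise ValueError("need beta > alpha > 0")
    return 2.0 ** (-(beta - alpha) * T)


if __name__ == "__main__":
    alpha, beta = 2.0, 3.0  # hypothetical digit rates (digits per second)
    for T in (1, 10, 100):
        # The rate reduction factor alpha/beta does not depend on T.
        print(T, alpha / beta, occupancy_factor(alpha, beta, T))
```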

One important question, however, still remains unanswered. What must be the rate reduction
ratio $\alpha/\beta$ for this dream to come true? To answer this question, we observe that increasing
$T$ increases the length of the transmitted sequence ($\beta T$ digits). If $P_e$ is the digit error
probability, then it can be seen from the relative frequency definition (or the law of large numbers)
that as $T \to \infty$, the total number of digits in error in a sequence of $\beta T$ digits ($\beta T \to \infty$) is
exactly $\beta T P_e$. Hence, the received sequences will be at a Hamming distance of $\beta T P_e$ from the
transmitted sequences. Therefore, for error-free communication, we must leave all the vertices
unoccupied within spheres of radius $\beta T P_e$ drawn around each of the $2^{\alpha T}$ occupied vertices.
In short, we must be able to pack $2^{\alpha T}$ nonoverlapping spheres, each of radius $\beta T P_e$, into the
Hamming space of $\beta T$ dimensions. This means that for a given $\beta$, $\alpha$ cannot be increased
beyond some limit without causing overlap in the spheres and the consequent failure of the
error correction scheme. Shannon's theorem states that for this scheme to work, $\alpha/\beta$ must
be less than the constant (channel capacity) $C_s$, which physically is a function of the channel
noise and the signal power:

$$\frac{\alpha}{\beta} < C_s \tag{13.16}$$

It must be remembered that such perfect, error-free communication is not practical. In 
this system we accumulate the information digits for T seconds before encoding them, and 
because T —> oo, for error-free communication we would have to wait until eternity to start 
encoding. Hence, there will be an infinite delay at the transmitter and an additional delay of 
the same amount at the receiver. Second, the equipment needed for the storage, encoding, 
and decoding sequence of infinite digits would be monstrous. Needless to say, the dream of 
error-free communication cannot be achieved in practice. Then what is the use of Shannon's
result? For one thing, it indicates the upper limit on the rate of error-free communication that
can be achieved on a channel. This in itself is monumental. Second, it indicates that we can
reduce the error probability below an arbitrarily small level by allowing only a small reduction 
in the rate of transmission of information digits. We can therefore seek a compromise between 
error-free communication with infinite delay and virtually error-free communication with a 
finite delay. 


13.4 CHANNEL CAPACITY OF A DISCRETE 
MEMORYLESS CHANNEL 

This section treats discrete memoryless channels. Consider a source that generates a message
that contains $r$ symbols $x_1, x_2, \ldots, x_r$. The receiver receives symbols $y_1, y_2, \ldots, y_s$. The set
of symbols $\{y_k\}$ may or may not be identical to the set $\{x_k\}$, depending on the nature of the
receiver. If we use receivers of the types discussed in Chapter 10, the set of received symbols
will be the same as the set transmitted. This is because the optimum receiver, upon receiving a
signal, decides which of the $r$ symbols $x_1, x_2, \ldots, x_r$ has been transmitted. Here we shall be
more general and shall not constrain the set $\{y_k\}$ to be identical to the set $\{x_k\}$.

If the channel is noiseless, then the reception of some symbol $y_j$ uniquely determines
the message transmitted. Because of noise, however, there is a certain amount of uncertainty
regarding the transmitted symbol when $y_j$ is received. If $P(x_i|y_j)$ represents the conditional
probability that $x_i$ was transmitted when $y_j$ is received, then there is an uncertainty of
$\log\,[1/P(x_i|y_j)]$ about $x_i$ when $y_j$ is received. When this uncertainty is averaged over all
$x_i$ and $y_j$, we obtain $H(x|y)$, which is the average uncertainty about the transmitted symbol $x$
when a symbol $y$ is received. Thus,

$$H(x|y) = \sum_i \sum_j P(x_i, y_j) \log \frac{1}{P(x_i|y_j)} \quad \text{bits per symbol} \tag{13.17}$$

For noiseless channels, the uncertainty would be zero.* Obviously, this uncertainty, $H(x|y)$,
is caused by channel noise. Hence, it is the average loss of information about a transmitted
symbol when a symbol is received. We call $H(x|y)$ the conditional entropy of $x$ given $y$ (i.e.,
the amount of uncertainty about $x$ once $y$ is known).
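
As an illustration of Eq. (13.17), the following Python sketch computes the conditional entropy from a joint probability table. The nested-list layout `joint[i][j]` for $P(x_i, y_j)$ is an assumption made here for the example.

```python
from math import log2


def conditional_entropy(joint):
    """H(x|y) in bits per symbol, where joint[i][j] = P(x_i, y_j)."""
    n_y = len(joint[0])
    # Marginal P(y_j) = sum over i of P(x_i, y_j)
    p_y = [sum(row[j] for row in joint) for j in range(n_y)]
    h = 0.0
    for row in joint:
        for j, p_xy in enumerate(row):
            if p_xy > 0:  # terms with P(x_i, y_j) = 0 contribute nothing
                # 1 / P(x_i | y_j) = P(y_j) / P(x_i, y_j)
                h += p_xy * log2(p_y[j] / p_xy)
    return h


if __name__ == "__main__":
    noiseless = [[0.5, 0.0], [0.0, 0.5]]        # y determines x exactly
    independent = [[0.25, 0.25], [0.25, 0.25]]  # y tells nothing about x
    print(conditional_entropy(noiseless), conditional_entropy(independent))
```

For the noiseless table the uncertainty is zero, and for the independent table the full 1 bit of source uncertainty remains after reception.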

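Note that $P(y_j|x_i)$ represents the a priori probability that $y_j$ is received when $x_i$ is
transmitted. This is a characteristic of the channel and the receiver. Thus, a given channel (with its
receiver) is specified by the channel matrix:

$$
\begin{array}{c|cccc}
\text{Inputs} \backslash \text{Outputs} & y_1 & y_2 & \cdots & y_s \\ \hline
x_1 & P(y_1|x_1) & P(y_2|x_1) & \cdots & P(y_s|x_1) \\
x_2 & P(y_1|x_2) & P(y_2|x_2) & \cdots & P(y_s|x_2) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_r & P(y_1|x_r) & P(y_2|x_r) & \cdots & P(y_s|x_r)
\end{array}
$$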


We can use Bayes' rule to obtain the a posteriori (or reverse) conditional probabilities $P(x_i|y_j)$:

$$
\begin{aligned}
P(x_i|y_j) &= \frac{P(y_j|x_i)P(x_i)}{P(y_j)} && \text{(13.18a)} \\
&= \frac{P(y_j|x_i)P(x_i)}{\sum_i P(x_i, y_j)} && \text{(13.18b)} \\
&= \frac{P(y_j|x_i)P(x_i)}{\sum_i P(x_i)P(y_j|x_i)} && \text{(13.18c)}
\end{aligned}
$$

Thus, if the input symbol probabilities $P(x_i)$ and the channel matrix are known, the a posteriori
conditional probabilities can be computed from Eqs. (13.18). The a posteriori conditional
probability $P(x_i|y_j)$ is the probability that $x_i$ was transmitted when $y_j$ is received.
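
A minimal Python sketch of Eq. (13.18c) follows. It assumes the channel matrix is stored as `channel[i][j]` for $P(y_j|x_i)$; the input probabilities and the error probability used in the demonstration are arbitrary illustrative values.

```python
def posterior_matrix(p_x, channel):
    """Return post[i][j] = P(x_i | y_j) from P(x_i) and channel[i][j] = P(y_j | x_i)."""
    n_x, n_y = len(p_x), len(channel[0])
    # Denominator of Eq. (13.18c): P(y_j) = sum over i of P(x_i) P(y_j | x_i)
    p_y = [sum(p_x[i] * channel[i][j] for i in range(n_x)) for j in range(n_y)]
    return [
        [p_x[i] * channel[i][j] / p_y[j] if p_y[j] > 0 else 0.0 for j in range(n_y)]
        for i in range(n_x)
    ]


if __name__ == "__main__":
    p_x = [0.8, 0.2]                    # hypothetical input probabilities
    channel = [[0.9, 0.1], [0.1, 0.9]]  # binary channel with error probability 0.1
    for row in posterior_matrix(p_x, channel):
        print(row)
```

Each column of the result sums to 1, since for every received symbol some input symbol must have been transmitted.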

For a noise-free channel, the average amount of information received would be $H(x)$
bits (entropy of the source) per received symbol. Note that $H(x)$ is the average information
transmitted over the channel per symbol. Because of channel noise, even when receiving $y$ we
still have some uncertainty about $x$ in the average amount of $H(x|y)$ bits of information per
symbol. Therefore, in this transaction of receiving $y$, the amount of information the receiver
receives is, on the average, $I(x; y)$ bits per received symbol, where

$$I(x; y) = H(x) - H(x|y) \quad \text{bits per symbol} \tag{13.19}$$


* This can be verified from the fact that for a noiseless channel all the probabilities in Eq. (13.17) are either 0 or 1.
If $P(x_i|y_j) = 1$, then $\log\,[1/P(x_i|y_j)] = 0$, and if $P(x_i|y_j) = 0$, then $P(x_i, y_j) = P(y_j)P(x_i|y_j) = 0$.
This shows that $H(x|y) = 0$.

$I(x; y)$ is called the mutual information of $x$ and $y$. Because

$$H(x) = \sum_i P(x_i) \log \frac{1}{P(x_i)} \quad \text{bits}$$

we have

$$I(x; y) = \sum_i P(x_i) \log \frac{1}{P(x_i)} - \sum_i \sum_j P(x_i, y_j) \log \frac{1}{P(x_i|y_j)}$$

Also because

$$\sum_j P(x_i, y_j) = P(x_i)$$

we have

$$
\begin{aligned}
I(x; y) &= \sum_i \sum_j P(x_i, y_j) \log \frac{1}{P(x_i)} - \sum_i \sum_j P(x_i, y_j) \log \frac{1}{P(x_i|y_j)} \\
&= \sum_i \sum_j P(x_i, y_j) \log \frac{P(x_i|y_j)}{P(x_i)} && \text{(13.20a)} \\
&= \sum_i \sum_j P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i)P(y_j)} && \text{(13.20b)}
\end{aligned}
$$

Alternatively, by using Bayes' rule in Eq. (13.20a), we can express $I(x; y)$ as

$$I(x; y) = \sum_i \sum_j P(x_i, y_j) \log \frac{P(y_j|x_i)}{P(y_j)} \tag{13.20c}$$

or we may substitute Eq. (13.18c) into Eq. (13.20a):

$$I(x; y) = \sum_i \sum_j P(x_i)P(y_j|x_i) \log \frac{P(y_j|x_i)}{\sum_i P(x_i)P(y_j|x_i)} \tag{13.20d}$$

Equation (13.20d) expresses $I(x; y)$ in terms of the input symbol probabilities and the channel
matrix.

The units of $I(x; y)$ should be carefully noted. Since $I(x; y)$ is the average amount of
information received per symbol transmitted, its units are bits per symbol. If we use binary
digits at the input, then the symbol is a binary digit, and the units of $I(x; y)$ are bits per binary
digit.
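
The mutual information can be evaluated directly from the input symbol probabilities and the channel matrix. The Python sketch below does this with base-2 logarithms, again assuming the layout `channel[i][j]` for $P(y_j|x_i)$; the three test channels are illustrative.

```python
from math import log2


def mutual_information(p_x, channel):
    """I(x; y) in bits per symbol from P(x_i) and channel[i][j] = P(y_j | x_i)."""
    n_y = len(channel[0])
    # P(y_j) = sum over i of P(x_i) P(y_j | x_i)
    p_y = [sum(p * row[j] for p, row in zip(p_x, channel)) for j in range(n_y)]
    total = 0.0
    for p, row in zip(p_x, channel):
        for j, p_y_given_x in enumerate(row):
            if p > 0 and p_y_given_x > 0:  # zero-probability terms contribute nothing
                total += p * p_y_given_x * log2(p_y_given_x / p_y[j])
    return total


if __name__ == "__main__":
    equiprobable = [0.5, 0.5]
    print(mutual_information(equiprobable, [[1.0, 0.0], [0.0, 1.0]]))  # noiseless
    print(mutual_information(equiprobable, [[0.5, 0.5], [0.5, 0.5]]))  # useless
    print(mutual_information(equiprobable, [[0.9, 0.1], [0.1, 0.9]]))  # noisy
```

The noiseless channel delivers the full source entropy of 1 bit per binary digit, the channel whose output is independent of its input delivers nothing, and the noisy channel lies in between.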

Because $I(x; y)$ in Eq. (13.20b) is symmetrical with respect to $x$ and $y$, it follows that

$$
\begin{aligned}
I(x; y) &= I(y; x) && \text{(13.21a)} \\
&= H(y) - H(y|x) && \text{(13.21b)}
\end{aligned}
$$

The quantity $H(y|x)$ is the conditional entropy of $y$ given $x$ and is the average uncertainty
about the received symbol when the transmitted symbol is known. Equation (13.21b) can be
rewritten as

$$H(x) - H(x|y) = H(y) - H(y|x) \tag{13.21c}$$

From Eq. (13.20d) it is clear that $I(x; y)$ is a function of the transmitted symbol probabilities
$P(x_i)$ and the channel matrix. For a given channel, $I(x; y)$ will be maximum for some set of
probabilities $P(x_i)$. This maximum value is the channel capacity $C_s$:

$$C_s = \max_{P(x_i)} I(x; y) \quad \text{bits per symbol} \tag{13.22}$$

Thus, because we have allowed the channel input to choose any symbol probabilities $P(x_i)$,
$C_s$ represents the maximum information that can be transmitted by one symbol over the
channel. These ideas will become clear from the following example of a binary symmetric
channel (BSC).


Example 13.3 Find the channel capacity of the BSC shown in Fig. 13.2.

Figure 13.2 Binary symmetric channel.

Let $P(x_1) = \alpha$ and $P(x_2) = \bar{\alpha} = 1 - \alpha$. Also,

$$
\begin{aligned}
P(y_1|x_2) &= P(y_2|x_1) = P_e \\
P(y_1|x_1) &= P(y_2|x_2) = \bar{P}_e = 1 - P_e
\end{aligned}
$$

Substitution of these probabilities into Eq. (13.20d) gives

$$
\begin{aligned}
I(x; y) &= \alpha \bar{P}_e \log \left( \frac{\bar{P}_e}{\alpha \bar{P}_e + \bar{\alpha} P_e} \right)
 + \alpha P_e \log \left( \frac{P_e}{\alpha P_e + \bar{\alpha} \bar{P}_e} \right) \\
&\quad + \bar{\alpha} P_e \log \left( \frac{P_e}{\alpha \bar{P}_e + \bar{\alpha} P_e} \right)
 + \bar{\alpha} \bar{P}_e \log \left( \frac{\bar{P}_e}{\alpha P_e + \bar{\alpha} \bar{P}_e} \right) \\
&= (\alpha P_e + \bar{\alpha} \bar{P}_e) \log \left( \frac{1}{\alpha P_e + \bar{\alpha} \bar{P}_e} \right)
 + (\alpha \bar{P}_e + \bar{\alpha} P_e) \log \left( \frac{1}{\alpha \bar{P}_e + \bar{\alpha} P_e} \right) \\
&\quad - \left( P_e \log \frac{1}{P_e} + \bar{P}_e \log \frac{1}{\bar{P}_e} \right)
\end{aligned}
$$

If we define

$$\rho(z) = z \log \frac{1}{z} + \bar{z} \log \frac{1}{\bar{z}}$$

with $\bar{z} = 1 - z$, then

$$I(x; y) = \rho(\alpha P_e + \bar{\alpha} \bar{P}_e) - \rho(P_e) \tag{13.23}$$

Figure 13.3 Plot of $\rho(z)$.

The function $\rho(z)$ vs. $z$ is shown in Fig. 13.3. It can be seen that $\rho(z)$ is maximum at
$z = 1/2$. (Note that we are interested in the region $0 < z < 1$ only.) For a given $P_e$, $\rho(P_e)$
is fixed. Hence from Eq. (13.23) it follows that $I(x; y)$ is maximum when $\rho(\alpha P_e + \bar{\alpha} \bar{P}_e)$
is maximum. This occurs when

$$\alpha P_e + \bar{\alpha} \bar{P}_e = 0.5$$

or

$$\alpha P_e + (1 - \alpha)(1 - P_e) = 0.5$$

This equation is satisfied when

$$\alpha = 0.5 \tag{13.24}$$

For this value of $\alpha$, $\rho(\alpha P_e + \bar{\alpha} \bar{P}_e) = 1$ and

$$
\begin{aligned}
C_s = \max_{P(x_i)} I(x; y) &= 1 - \rho(P_e) \\
&= 1 - \left[ P_e \log \frac{1}{P_e} + (1 - P_e) \log \left( \frac{1}{1 - P_e} \right) \right]
\end{aligned}
\tag{13.25}
$$

Figure 13.4 Binary symmetric channel capacity as a function of error probability $P_e$.

From Fig. 13.4, which shows $C_s$ vs. $P_e$, it follows that the maximum value of $C_s$ is
unity. This means we can transmit at most 1 bit of information per binary digit. This is the
expected result, because one binary digit can convey one of the two equiprobable messages.
The information content of one of the two equiprobable messages is $\log_2 2 = 1$ bit. Second,
we observe that $C_s$ is maximum when the error probability $P_e = 0$ or $P_e = 1$. When the
error probability $P_e = 0$, the channel is noiseless, and we expect $C_s$ to be maximum. But
surprisingly, $C_s$ is also maximum when $P_e = 1$. This is easy to explain, because a channel
that consistently and with certainty makes errors is as good as a noiseless channel. All we
have to do to have error-free reception is reverse the decision that is made; that is, if 0 is
received, we decide that 1 was actually sent, and vice versa. The channel capacity $C_s$ is
zero (minimum) when $P_e = 1/2$. If the error probability is $1/2$, then the transmitted symbols
and the received symbols are statistically independent. If we received 0, for example,
either 1 or 0 is equally likely to have been transmitted, and the information received
is zero.
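
The result of Example 13.3 can be checked numerically. The Python sketch below evaluates Eq. (13.23) over a grid of input probabilities and compares the best value with the closed form of Eq. (13.25). The grid search and the choice $P_e = 0.1$ are illustrative assumptions, not part of the example.

```python
from math import log2


def rho(z):
    """rho(z) = z log2(1/z) + (1 - z) log2(1/(1 - z)), with rho(0) = rho(1) = 0."""
    if z in (0.0, 1.0):
        return 0.0
    return z * log2(1 / z) + (1 - z) * log2(1 / (1 - z))


def bsc_mutual_information(alpha, p_e):
    """Eq. (13.23): I(x; y) for P(x_1) = alpha over a BSC with error probability p_e."""
    return rho(alpha * p_e + (1 - alpha) * (1 - p_e)) - rho(p_e)


def bsc_capacity(p_e):
    """Eq. (13.25): C_s = 1 - rho(p_e) bits per binary digit."""
    return 1.0 - rho(p_e)


if __name__ == "__main__":
    p_e = 0.1
    # Brute-force search over alpha = 0, 0.001, ..., 1 for the maximizing input probability.
    best_alpha = max((k / 1000 for k in range(1001)),
                     key=lambda a: bsc_mutual_information(a, p_e))
    print(best_alpha, bsc_mutual_information(best_alpha, p_e), bsc_capacity(p_e))
```

The search settles on equiprobable inputs, and the capacity is 1 bit at $P_e = 0$ and $P_e = 1$ and zero at $P_e = 1/2$, in agreement with the discussion of Fig. 13.4.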


Channel Capacity per Second 

The channel capacity $C_s$ in Eq. (13.22) gives the maximum possible information transmitted
when one symbol (digit) is transmitted. If $K$ symbols are being transmitted per second, then the
maximum rate of transmission of information per second is $KC_s$. This is the channel capacity
in information units per second and will be denoted by $C$ (in bits per second):

$$C = KC_s$$

A Comment on Channel Capacity: Channel capacity is the property of a particular
physical channel over which the information is transmitted. This is true provided
the term channel is correctly interpreted. A channel means not only the transmission 
medium, it also includes the specifications of the kind of signals (binary, r-ary, etc., or 
orthogonal, simplex, etc.) and the kind of receiver used (the receiver determines the error 
probability). All these specifications are included in the channel matrix. A channel matrix 
completely specifies a channel. If we decide to use, for example, 4-ary digits instead of 
binary digits over the same physical channel, the channel matrix changes (it becomes a 
4x4 matrix), as does the channel capacity. Similarly, a change in the receiver or the 
signal power or noise power will change the channel matrix and, hence, the channel 
capacity. 

























Measuring Channel Capacity 

The channel capacity $C_s$ is the maximum value of $H(x) - H(x|y)$; naturally, $C_s \le \max H(x)$
[because $H(x|y) \ge 0$]. But $H(x)$ is the average information per input symbol. Hence, $C_s$
is always less than (or equal to) the maximum average information per input symbol. If
we use binary symbols at the input, the maximum value of $H(x)$ is 1 bit, occurring when
$P(x_1) = P(x_2) = 1/2$. Hence, for a binary channel, $C_s \le 1$ bit per binary digit. If we use
$r$-ary symbols, the maximum value of $H_r(x)$ is 1 $r$-ary unit. Hence, $C_s \le 1$ $r$-ary unit per
symbol.


Verification of Error-Free Communication over a BSC 

We have shown that over a noisy channel, $C_s$ bits of information can be transmitted per symbol.
If we consider a binary channel, this means that for each binary digit (symbol) transmitted,
the received information is $C_s$ bits ($C_s \le 1$). Thus, to transmit 1 bit of information, we
need to transmit at least $1/C_s$ binary digits. This gives a code efficiency $C_s$ and redundancy
$1 - C_s$. Here, the transmission of information means error-free transmission, since $I(x; y)$
was defined as the transmitted information minus the loss of information caused by channel
noise.

The problem with this derivation is that it is based on a certain speculative definition of
information [Eq. (13.1)]. And based on this definition, we defined the information lost during
the transmission over the channel. We really have no direct proof that the information lost over
the channel will oblige us in this way. Hence, the only way to ensure that this whole speculative
structure is sound is to verify it. If we can show that $C_s$ bits of error-free information can be
transmitted per symbol over a channel, the verification will be complete. A general case will
be discussed later. Here we shall verify the results for a BSC.

Let us consider a binary source emitting messages at a rate of $\alpha$ digits per second. We
accumulate these information digits over $T$ seconds to give a total of $\alpha T$ digits. Because
$\alpha T$ digits form $2^{\alpha T}$ possible combinations, our problem is now to transmit one of these
$2^{\alpha T}$ supermessages every $T$ seconds. These supermessages are transmitted by a code of
word length $\beta T$ digits, with $\beta > \alpha$ to ensure redundancy. Because $\beta T$ digits can form
$2^{\beta T}$ distinct patterns (vertices of a $\beta T$-dimensional hypercube), and we have only $2^{\alpha T}$
messages, we are utilizing only a $2^{-(\beta - \alpha)T}$ fraction of the vertices. The remaining vertices
are deliberately unused, to combat noise. If we let $T \to \infty$, the fraction of vertices used
approaches 0. Because there are $\beta T$ digits in each transmitted sequence, the number of
digits received in error will be exactly $\beta T P_e$ when $T \to \infty$. We now construct Hamming
spheres of radius $\beta T P_e$ each around the $2^{\alpha T}$ vertices used for the messages. When any
message is transmitted, the received message will be in the Hamming sphere surrounding the
vertex corresponding to that message. We use the following decision rule: If a received
sequence falls inside or on a sphere surrounding message $m_i$, then the decision is "$m_i$ is
transmitted." If $T \to \infty$, the decision will be without error if all the $2^{\alpha T}$ spheres are
nonoverlapping.

Of all the possible sequences of $\beta T$ digits, the number of sequences that differ from a
given sequence by exactly $j$ digits is $\binom{\beta T}{j}$ (see Example 8.6). Hence, $K$, the total number of
sequences that differ from a given sequence by less than or equal to $\beta T P_e$ digits, is

$$K = \sum_{j=0}^{\beta T P_e} \binom{\beta T}{j} \tag{13.26}$$

Here we use an inequality often used in information theory [4, 7]:

$$\sum_{j=0}^{\beta T P_e} \binom{\beta T}{j} \le 2^{\beta T \rho(P_e)} \qquad P_e < 0.5$$

Hence,

$$K \le 2^{\beta T \rho(P_e)} \tag{13.27}$$

with the definition that

$$\rho(P_e) = P_e \log \frac{1}{P_e} + (1 - P_e) \log \frac{1}{1 - P_e}$$
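
The bound on the size of a Hamming sphere can be checked for moderate block lengths. In the Python sketch below, `n` plays the role of $\beta T$, and the sphere radius is taken as the integer part of $n P_e$; both the helper names and that rounding choice are assumptions made for this illustration.

```python
from math import comb, floor, log2


def rho(p):
    """rho(p) = p log2(1/p) + (1 - p) log2(1/(1 - p)) for 0 < p < 1."""
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))


def sphere_size(n, p_e):
    """K of Eq. (13.26): number of n-digit sequences within distance floor(n * p_e)."""
    return sum(comb(n, j) for j in range(floor(n * p_e) + 1))


if __name__ == "__main__":
    p_e = 0.1
    for n in (10, 100, 1000):
        # Compare exponents: log2(K) against the bound n * rho(p_e) of Eq. (13.27).
        print(n, log2(sphere_size(n, p_e)), n * rho(p_e))
```

In each case the exponent of the exact count stays below $n\rho(P_e)$, and the two grow at a comparable rate as $n$ increases.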

From the $2^{\beta T}$ possible vertices we choose $2^{\alpha T}$ vertices to be assigned to the supermessages.
How shall we select these vertices? From the decision procedure it is clear that if we assign
a particular vertex to a supermessage, then none of the other vertices lying within a sphere
of radius $\beta T P_e$ can be assigned to another supermessage. Thus, when we choose a vertex for
$m_1$, the corresponding $K$ vertices [Eq. (13.26)] become ineligible for consideration. We must
choose, from the remaining $2^{\beta T} - K$ vertices, another vertex for $m_2$. We proceed in this way
until all the $2^{\beta T}$ vertices have been exhausted. This is a rather tedious procedure. Let us see
what happens if we choose the required $2^{\alpha T}$ vertices randomly from the $2^{\beta T}$ vertices. In this
procedure there is the danger of selecting more than one vertex lying within a distance $\beta T P_e$.
If, however, $\alpha/\beta$ is sufficiently small, the probability of making such a choice is extremely
small as $T \to \infty$. The probability of choosing any particular vertex $s_1$ as one of the $2^{\alpha T}$
vertices from $2^{\beta T}$ vertices is $2^{\alpha T}/2^{\beta T} = 2^{-(\beta - \alpha)T}$.

Remembering that $K$ vertices lie within a distance of $\beta T P_e$ digits from $s_1$, the probability
that we may also choose another vertex lying within the distance $\beta T P_e$ of $s_1$ (that is, one of
the $K$ vertices that form the Hamming sphere around $s_1$) is

$$P = K \, 2^{-(\beta - \alpha)T}$$

From Eq. (13.27) it follows that

$$P \le 2^{-[1 - \rho(P_e) - \alpha/\beta]\beta T}$$

Hence, as $T \to \infty$, $P \to 0$ if

$$1 - \rho(P_e) - \frac{\alpha}{\beta} > 0$$

that is, if

$$\frac{\alpha}{\beta} < 1 - \rho(P_e) \tag{13.28a}$$

But $1 - \rho(P_e)$ is $C_s$, the channel capacity of a BSC [Eq. (13.25)]. Therefore,

$$\frac{\alpha}{\beta} < C_s \tag{13.28b}$$

Hence, the probability of choosing two sequences randomly within a distance $\beta T P_e$ approaches
0 as $T \to \infty$ provided $\alpha/\beta < C_s$, and we have error-free communication. We can choose
$\alpha/\beta = C_s - \epsilon$, where $\epsilon$ is arbitrarily small.
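
The role of the condition $\alpha/\beta < C_s$ can be seen by evaluating the bound $2^{-[1 - \rho(P_e) - \alpha/\beta]\beta T}$ for rates on either side of capacity. The Python sketch below is only an illustration; the rates, the error probability, and the function names are assumed values chosen for the demonstration.

```python
from math import log2


def rho(p):
    """rho(p) = p log2(1/p) + (1 - p) log2(1/(1 - p)) for 0 < p < 1."""
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))


def overlap_bound(alpha, beta, p_e, T):
    """Upper bound 2**(-[1 - rho(p_e) - alpha/beta] * beta * T) on the overlap probability."""
    exponent = (1.0 - rho(p_e) - alpha / beta) * beta * T
    return 2.0 ** (-exponent)


if __name__ == "__main__":
    p_e = 0.1  # capacity C_s = 1 - rho(0.1), about 0.531 bit per binary digit
    for T in (10, 100, 1000):
        below = overlap_bound(1.0, 2.0, p_e, T)  # alpha/beta = 0.5 < C_s
        above = overlap_bound(3.0, 5.0, p_e, T)  # alpha/beta = 0.6 > C_s
        print(T, below, above)
```

Below capacity the bound decays toward zero as $T$ grows. Above capacity it exceeds 1 and therefore guarantees nothing.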
13.5 CHANNEL CAPACITY OF A CONTINUOUS
MEMORYLESS CHANNEL

For a discrete random variable $x$ taking on values $x_1, x_2, \ldots, x_n$ with probabilities $P(x_1)$,
$P(x_2), \ldots, P(x_n)$, the entropy $H(x)$ was defined as

$$H(x) = -\sum_{i=1}^{n} P(x_i) \log P(x_i) \tag{13.29}$$

For analog data, we have to deal with continuous random variables. Therefore, we must extend
the definition of entropy to continuous random variables. One is tempted to state that $H(x)$ for
continuous random variables is obtained by using the integral instead of discrete summation
in Eq. (13.29)*:

$$H(x) = \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)} \, dx \tag{13.30}$$

We shall see that Eq. (13.30) is indeed the meaningful definition of entropy for a continuous
random variable. We cannot accept this definition, however, unless we show that it has
the meaningful interpretation as uncertainty. A random variable $x$ takes a value in the range
$(n\Delta x, (n+1)\Delta x)$ with probability $p(n\Delta x)\,\Delta x$ in the limit as $\Delta x \to 0$. The error in the
approximation will vanish in the limit as $\Delta x \to 0$. Hence $H(x)$, the entropy of a continuous random
variable $x$, is given by

$$
\begin{aligned}
H(x) &= \lim_{\Delta x \to 0} \sum_n p(n\Delta x)\,\Delta x \log \frac{1}{p(n\Delta x)\,\Delta x} \\
&= \lim_{\Delta x \to 0} \left[ \sum_n p(n\Delta x)\,\Delta x \log \frac{1}{p(n\Delta x)} - \sum_n p(n\Delta x)\,\Delta x \log \Delta x \right] \\
&= \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)} \, dx - \lim_{\Delta x \to 0} \log \Delta x \int_{-\infty}^{\infty} p(x) \, dx \\
&= \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)} \, dx - \lim_{\Delta x \to 0} \log \Delta x
\end{aligned}
\tag{13.31}
$$


In the limit as $\Delta x \to 0$, $\log \Delta x \to -\infty$. It therefore appears that the entropy of a continuous
random variable is infinite. This is quite true. The magnitude of uncertainty associated with
a continuous random variable is infinite. This fact is also apparent intuitively. A continuous
random variable assumes an uncountable infinite number of values, and, hence, the uncertainty
is on the order of infinity. Does this mean that there is no meaningful definition of entropy for
a continuous random variable? On the contrary, we shall see that the first term in Eq. (13.31)
serves as a meaningful measure of the entropy (average information) of a continuous random
variable $x$. This may be argued as follows. We can consider $\int p(x) \log\,[1/p(x)]\,dx$ as a relative
entropy with $-\log \Delta x$ serving as a datum, or reference. The information transmitted over a
channel is actually the difference between the two terms $H(x)$ and $H(x|y)$. Obviously, if we
have a common datum for both $H(x)$ and $H(x|y)$, the difference $H(x) - H(x|y)$ will be the same
as the difference between their relative entropies. We are therefore justified in considering the
first term in Eq. (13.31) as the differential entropy of $x$. We must, however, always remember
that this is a relative entropy and not the absolute entropy. Failure to realize this subtle point
generates many apparent fallacies, one of which will be given in Example 13.4.

* Throughout this discussion, the PDF $p_x(x)$ will be abbreviated as $p(x)$; this practice causes no ambiguity and
improves the clarity of the equations.

Based on this argument, we define $H(x)$, the differential entropy of a continuous random
variable $x$, as

$$
\begin{aligned}
H(x) &= \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)} \, dx \quad \text{bits} && \text{(13.32a)} \\
&= -\int_{-\infty}^{\infty} p(x) \log p(x) \, dx \quad \text{bits} && \text{(13.32b)}
\end{aligned}
$$

Although $H(x)$ is the differential (relative) entropy of $x$, we shall call it the entropy of random
variable $x$ for brevity.

Example 13.4 A signal amplitude $x$ is a random variable uniformly distributed in the range $(-1, 1)$. This signal
is passed through an amplifier of gain 2. The output $y$ is also a random variable, uniformly
distributed in the range $(-2, 2)$. Determine the (differential) entropies $H(x)$ and $H(y)$.

We have

$$
p(x) = \begin{cases} \dfrac{1}{2} & |x| < 1 \\ 0 & \text{otherwise} \end{cases}
\qquad
p(y) = \begin{cases} \dfrac{1}{4} & |y| < 2 \\ 0 & \text{otherwise} \end{cases}
$$

Hence,

$$
\begin{aligned}
H(x) &= \int_{-1}^{1} \frac{1}{2} \log 2 \, dx = 1 \text{ bit} \\
H(y) &= \int_{-2}^{2} \frac{1}{4} \log 4 \, dy = 2 \text{ bits}
\end{aligned}
$$

The entropy of the random variable $y$ is 1 bit higher than that of $x$. This result may come
as a surprise, since a knowledge of $x$ uniquely determines $y$, and vice versa, because
$y = 2x$. Hence, the average uncertainty of $x$ and $y$ should be identical. Amplification itself
can neither add nor subtract information. Why, then, is $H(y)$ twice as large as $H(x)$? This
becomes clear when we remember that $H(x)$ and $H(y)$ are differential (relative) entropies,
and they will be equal if and only if their datum (or reference) entropies are equal. The
reference entropy $R_1$ for $x$ is $-\log \Delta x$, and the reference entropy $R_2$ for $y$ is $-\log \Delta y$
(in the limit as $\Delta x, \Delta y \to 0$):

$$
\begin{aligned}
R_1 &= \lim_{\Delta x \to 0} -\log \Delta x \\
R_2 &= \lim_{\Delta y \to 0} -\log \Delta y
\end{aligned}
$$

and

$$R_1 - R_2 = \lim_{\Delta x, \Delta y \to 0} \log \left( \frac{\Delta y}{\Delta x} \right) = \log \left( \frac{dy}{dx} \right) = \log 2 = 1 \text{ bit}$$

Thus, $R_1$, the reference entropy of $x$, is higher than the reference entropy $R_2$ for $y$. Hence,
if $x$ and $y$ have equal absolute entropies, their differential (relative) entropies must differ
by 1 bit.
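
The numbers in Example 13.4 can be reproduced with a few lines of Python. The sketch below estimates the differential entropies by a midpoint-rule sum and then compares the entropies of the quantized variables when the quantization step is scaled along with the amplitude. The step sizes and function names are illustrative assumptions.

```python
from math import log2


def riemann_entropy(pdf, a, b, n=10_000):
    """Midpoint-rule estimate of the integral of p(x) log2(1/p(x)) over (a, b), in bits."""
    dx = (b - a) / n
    total = 0.0
    for k in range(n):
        p = pdf(a + (k + 0.5) * dx)
        if p > 0:
            total += p * log2(1 / p) * dx
    return total


def quantized_uniform_entropy(width, step):
    """Entropy in bits of a uniform variable of given width quantized into equal bins."""
    return log2(width / step)  # width/step equally likely bins


if __name__ == "__main__":
    h_x = riemann_entropy(lambda x: 0.5, -1.0, 1.0)   # x uniform on (-1, 1)
    h_y = riemann_entropy(lambda y: 0.25, -2.0, 2.0)  # y = 2x uniform on (-2, 2)
    print(h_x, h_y)
    # With step dy = 2 dx, the quantized (absolute) entropies coincide.
    print(quantized_uniform_entropy(2.0, 0.01), quantized_uniform_entropy(4.0, 0.02))
```

The differential entropies differ by 1 bit, while the quantized entropies agree once the step for $y$ is twice the step for $x$, which is the point of the example.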


Maximum Entropy for a Given Mean Square Value of x 

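For discrete random variables, we observed that entropy was maximum when all the outcomes
(messages) were equally likely (uniform probability distribution). For continuous random
variables, there also exists a PDF $p(x)$ that maximizes $H(x)$ in Eqs. (13.32). In the case of a
continuous distribution, however, we may have additional constraints on $x$. Either the maximum
value of $x$ or the mean square value of $x$ may be given. We shall find here the PDF $p(x)$ that
will yield maximum entropy when $\overline{x^2}$ is given to be a constant $\sigma^2$. The problem, then, is to
maximize $H(x)$:

$$H(x) = \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)} \, dx \tag{13.33}$$

with the constraints

$$
\begin{aligned}
\int_{-\infty}^{\infty} p(x) \, dx &= 1 && \text{(13.34a)} \\
\int_{-\infty}^{\infty} x^2 p(x) \, dx &= \sigma^2 && \text{(13.34b)}
\end{aligned}
$$

To solve this problem, we use a theorem from the calculus of variations. Given the integral $I$,

$$I = \int_a^b F(x, p) \, dx \tag{13.35}$$

subject to the following constraints:

$$
\begin{aligned}
\int_a^b \varphi_1(x, p) \, dx &= \lambda_1 \\
\int_a^b \varphi_2(x, p) \, dx &= \lambda_2 \\
&\;\;\vdots \\
\int_a^b \varphi_k(x, p) \, dx &= \lambda_k
\end{aligned}
\tag{13.36}
$$

where $\lambda_1, \lambda_2, \ldots, \lambda_k$ are given constants. The result from the calculus of variations states that
the form of $p(x)$ that maximizes $I$ in Eq. (13.35) with the constraints in Eq. (13.36) is found
from the solution of the equation

$$\frac{\partial F}{\partial p} + \alpha_1 \frac{\partial \varphi_1}{\partial p} + \alpha_2 \frac{\partial \varphi_2}{\partial p} + \cdots + \alpha_k \frac{\partial \varphi_k}{\partial p} = 0 \tag{13.37}$$

The quantities $\alpha_1, \alpha_2, \ldots, \alpha_k$ are adjustable constants, called undetermined multipliers,
which can be found by substituting the solution of $p(x)$ [obtained from Eq. (13.37)] in
Eq. (13.36). In the present case,

$$
\begin{aligned}
F(p, x) &= p \log \frac{1}{p} \\
\varphi_1(x, p) &= p \\
\varphi_2(x, p) &= x^2 p
\end{aligned}
$$

Hence, the solution for $p$ is given by

$$\frac{\partial}{\partial p} \left( p \log \frac{1}{p} \right) + \alpha_1 + \alpha_2 \frac{\partial}{\partial p} (x^2 p) = 0$$

or

$$-(1 + \log p) + \alpha_1 + \alpha_2 x^2 = 0$$

Solving for $p$, we have

$$p = e^{(\alpha_1 - 1)} e^{\alpha_2 x^2} \tag{13.38}$$

Substituting Eq. (13.38) into Eq. (13.34a), we have

$$
\begin{aligned}
1 &= \int_{-\infty}^{\infty} e^{\alpha_1 - 1} e^{\alpha_2 x^2} \, dx \\
&= 2 e^{\alpha_1 - 1} \int_0^{\infty} e^{\alpha_2 x^2} \, dx \\
&= 2 e^{\alpha_1 - 1} \left( \frac{1}{2} \sqrt{\frac{\pi}{-\alpha_2}} \right)
\end{aligned}
$$

provided $\alpha_2$ is negative, or

$$e^{\alpha_1 - 1} = \sqrt{\frac{-\alpha_2}{\pi}} \tag{13.39}$$

Next we substitute Eqs. (13.38) and (13.39) into Eq. (13.34b):

$$
\begin{aligned}
\sigma^2 &= \int_{-\infty}^{\infty} x^2 \sqrt{\frac{-\alpha_2}{\pi}} \, e^{\alpha_2 x^2} \, dx \\
&= 2 \sqrt{\frac{-\alpha_2}{\pi}} \int_0^{\infty} x^2 e^{\alpha_2 x^2} \, dx \\
&= -\frac{1}{2\alpha_2}
\end{aligned}
$$

or

$$\alpha_2 = -\frac{1}{2\sigma^2} \tag{13.40a}$$

and

$$e^{\alpha_1 - 1} = \sqrt{\frac{1}{2\pi\sigma^2}} \tag{13.40b}$$

Substituting Eqs. (13.40) into Eq. (13.38), we have

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-x^2/2\sigma^2} \tag{13.41}$$

We therefore conclude that for a given mean square value, the maximum entropy (or maximum
uncertainty) is obtained when the distribution of $x$ is Gaussian. This maximum entropy, or
uncertainty, is given by

$$H(x) = \int_{-\infty}^{\infty} p(x) \log_2 \frac{1}{p(x)} \, dx$$

Note that

$$\log \frac{1}{p(x)} = \log \left( \sqrt{2\pi\sigma^2} \, e^{x^2/2\sigma^2} \right) = \frac{1}{2} \log (2\pi\sigma^2) + \frac{x^2}{2\sigma^2} \log e$$

Hence,

$$
\begin{aligned}
H(x) &= \int_{-\infty}^{\infty} p(x) \left[ \frac{1}{2} \log (2\pi\sigma^2) + \frac{x^2}{2\sigma^2} \log e \right] dx && \text{(13.42a)} \\
&= \frac{1}{2} \log (2\pi\sigma^2) \int_{-\infty}^{\infty} p(x) \, dx + \frac{\log e}{2\sigma^2} \int_{-\infty}^{\infty} x^2 p(x) \, dx \\
&= \frac{1}{2} \log (2\pi\sigma^2) + \frac{\log e}{2\sigma^2} \, \sigma^2 \\
&= \frac{1}{2} \log (2\pi e \sigma^2) && \text{(13.42b)} \\
&= \frac{1}{2} \log (17.1\sigma^2) && \text{(13.42c)}
\end{aligned}
$$

To reiterate, for a given mean square value $\overline{x^2}$, the entropy is maximum for a Gaussian
distribution, and the corresponding entropy is $\frac{1}{2} \log (2\pi e \sigma^2)$.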
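
As a quick numerical illustration of this result, the Python sketch below compares the differential entropy of a Gaussian PDF with those of two other zero-mean PDFs having the same mean square value. The uniform and Laplacian comparisons use their standard closed-form entropies and are included here only as assumed examples; they do not appear in the text.

```python
from math import e, log2, pi, sqrt


def gaussian_entropy(sigma2):
    """Eq. (13.42b): (1/2) log2(2 pi e sigma^2) bits."""
    return 0.5 * log2(2 * pi * e * sigma2)


def uniform_entropy(sigma2):
    """Uniform on (-M, M) with M^2 / 3 = sigma^2, so the entropy is log2(2M) bits."""
    return log2(2 * sqrt(3 * sigma2))


def laplacian_entropy(sigma2):
    """Laplacian with scale b, where 2 b^2 = sigma^2; the entropy is log2(2 e b) bits."""
    return log2(2 * e * sqrt(sigma2 / 2))


if __name__ == "__main__":
    sigma2 = 1.0
    print(gaussian_entropy(sigma2), laplacian_entropy(sigma2), uniform_entropy(sigma2))
```

For unit mean square value the Gaussian entropy is about 2.05 bits, and both alternatives fall below it, as the derivation predicts.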

The reader can similarly show (Prob. 13-5.1) that if $x$ is constrained to some peak value
$M$ ($-M < x < M$), then the entropy is maximum when $x$ is uniformly distributed:

$$
p(x) = \begin{cases} \dfrac{1}{2M} & -M < x < M \\ 0 & \text{otherwise} \end{cases}
$$


Entropy of a Band-Limited White Gaussian Noise 

Consider a band-limited white Gaussian noise $n(t)$ with power spectral density (PSD) $\mathcal{N}/2$.
Because

$$R_n(\tau) = \mathcal{N}B \,\text{sinc}\,(2\pi B\tau)$$

we know that $\text{sinc}\,(2\pi B\tau)$ is zero at $\tau = \pm k/2B$ ($k$ a nonzero integer). Therefore,

$$R_n\left(\frac{k}{2B}\right) = 0 \qquad k = \pm 1, \pm 2, \ldots$$

Hence,

$$R_n\left(\frac{k}{2B}\right) = \overline{n(t)\,n\left(t + \frac{k}{2B}\right)} = 0 \qquad k = \pm 1, \pm 2, \ldots$$

Because $n(t)$ and $n(t + k/2B)$ ($k = \pm 1, \pm 2, \ldots$) are Nyquist samples of $n(t)$, it follows that all
Nyquist samples of $n(t)$ are uncorrelated. Because $n(t)$ is Gaussian, uncorrelatedness implies
independence. Hence, all Nyquist samples of $n(t)$ are independent. Note that

$$\overline{n^2} = R_n(0) = \mathcal{N}B$$

Hence, the variance of each Nyquist sample is $\mathcal{N}B$. From Eq. (13.42b) it follows that the
entropy $H(n)$ of each Nyquist sample of $n(t)$ is

$$H(n) = \frac{1}{2} \log (2\pi e \mathcal{N}B) \quad \text{bits per sample} \tag{13.43a}$$

Because $n(t)$ is completely specified by $2B$ Nyquist samples per second, the entropy per second
of $n(t)$ is the entropy of $2B$ Nyquist samples. Because all the samples are independent,
knowledge of one sample gives no information about any other sample. Hence, the entropy of
$2B$ Nyquist samples is the sum of the entropies of the $2B$ samples, and

$$H'(n) = B \log (2\pi e \mathcal{N}B) \quad \text{bit/s} \tag{13.43b}$$

where $H'(n)$ is the entropy per second of $n(t)$.
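
The two steps of this argument, uncorrelated Nyquist samples and additive entropies, can be sketched in Python as follows. The parameter `n0` stands for the constant written $\mathcal{N}$ above, and the numerical values are arbitrary illustrations; sinc is taken as $\sin(x)/x$, matching the zeros quoted in the text.

```python
from math import e, log2, pi, sin


def autocorrelation(tau, n0, bandwidth):
    """R_n(tau) = n0 * B * sinc(2 pi B tau), with sinc(x) = sin(x) / x."""
    x = 2 * pi * bandwidth * tau
    return n0 * bandwidth * (1.0 if x == 0 else sin(x) / x)


def entropy_per_sample(n0, bandwidth):
    """Eq. (13.43a): (1/2) log2(2 pi e n0 B) bits per Nyquist sample."""
    return 0.5 * log2(2 * pi * e * n0 * bandwidth)


def entropy_per_second(n0, bandwidth):
    """Eq. (13.43b): 2B independent samples per second, so the entropies add."""
    return 2 * bandwidth * entropy_per_sample(n0, bandwidth)


if __name__ == "__main__":
    n0, B = 2.0, 4.0  # hypothetical PSD constant and bandwidth in Hz
    print([autocorrelation(k / (2 * B), n0, B) for k in (0, 1, 2, 3)])
    print(entropy_per_sample(n0, B), entropy_per_second(n0, B))
```

Up to floating-point rounding, the autocorrelation vanishes at every nonzero multiple of $1/2B$, and at zero lag it equals the sample variance.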

From the results derived thus far, we can draw one significant conclusion. Among all
signals band-limited to $B$ Hz and constrained to have a certain mean square value $\sigma^2$, the
white Gaussian band-limited signal has the largest entropy per second. To understand the
reason for this, recall that for a given mean square value, Gaussian samples have the largest
entropy; moreover, all the $2B$ samples of a Gaussian band-limited process are independent.
Hence, the entropy per second is the sum of the entropies of all the $2B$ samples. In processes
that are not white, the Nyquist samples are correlated, and, hence, the entropy per
second is less than the sum of the entropies of the $2B$ samples. If the signal is not Gaussian,
then its samples are not Gaussian, and, hence, the entropy per sample is also less than the
maximum possible entropy for a given mean square value. To reiterate, for a class of band-limited
signals constrained to a certain mean square value, the white Gaussian signal has
the largest entropy per second, or the largest amount of uncertainty. This is also the reason
why white Gaussian noise is the worst possible noise in terms of interference with signal
transmission.

Mutual Information I(x; y)

The ultimate test of any concept is its usefulness. We shall now show that the relative entropy
defined in Eqs. (13.32) does lead to meaningful results when we consider I(x; y), the mutual
information of continuous random variables x and y. We wish to transmit a random variable
x over a channel. Each value of x in a given continuous range is now a message that may be
transmitted, for example, as a pulse of height x. The message recovered by the receiver will
be a continuous random variable y. If the channel were noise free, the received value y would
uniquely determine the transmitted value x. But channel noise introduces a certain uncertainty
about the true value of x. Consider the event that at the transmitter, a value of x in the interval
$(x, x + \Delta x)$ has been transmitted ($\Delta x \to 0$). The probability of this event is $p(x)\Delta x$ in the
limit $\Delta x \to 0$. Hence, the amount of information transmitted is $\log [1/p(x)\Delta x]$. Let the value
received be y, and let $p(x|y)$ be the conditional probability density of x given that the received
variable equals y. Then $p(x|y)\Delta x$ is the probability that x will lie in the interval $(x, x + \Delta x)$
given this received value (provided $\Delta x \to 0$). Obviously, there is an uncertainty about the event
that x lies in the interval $(x, x + \Delta x)$. This uncertainty, $\log [1/p(x|y)\Delta x]$, arises because of
channel noise and therefore represents a loss of information. Because $\log [1/p(x)\Delta x]$ is the
information transmitted and $\log [1/p(x|y)\Delta x]$ is the information lost over the channel, the net
information received is I(x; y) given by


$$ I(x; y) = \log \left[ \frac{1}{p(x)\,\Delta x} \right] - \log \left[ \frac{1}{p(x|y)\,\Delta x} \right] = \log \frac{p(x|y)}{p(x)} \tag{13.44} $$


Note that this relation is true in the limit $\Delta x \to 0$. Therefore, I(x; y) represents the information
transmitted over a channel if we receive the value y when the value x is transmitted. We
are interested in finding the average information transmitted over a channel when some x is
transmitted and a certain y is received. We must therefore average I(x; y) over all values of
x and y. The average information transmitted will be denoted by $I(\mathrm{x}; \mathrm{y})$, where

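$$
\begin{aligned}
I(\mathrm{x}; \mathrm{y}) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y)\, I(x; y)\, dx\, dy && \text{(13.45a)} \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log \frac{p(x|y)}{p(x)}\, dx\, dy && \text{(13.45b)} \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log \frac{1}{p(x)}\, dx\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log p(x|y)\, dx\, dy \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x)\,p(y|x) \log \frac{1}{p(x)}\, dx\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log p(x|y)\, dx\, dy \\
&= \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)}\, dx \int_{-\infty}^{\infty} p(y|x)\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log p(x|y)\, dx\, dy
\end{aligned}
$$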


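Note that

$$ \int_{-\infty}^{\infty} p(y|x)\, dy = 1 \qquad \text{and} \qquad \int_{-\infty}^{\infty} p(x) \log \frac{1}{p(x)}\, dx = H(\mathrm{x}) $$

Hence,

$$
\begin{aligned}
I(\mathrm{x}; \mathrm{y}) &= H(\mathrm{x}) + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log p(x|y)\, dx\, dy && \text{(13.46a)} \\
&= H(\mathrm{x}) - \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log \frac{1}{p(x|y)}\, dx\, dy && \text{(13.46b)}
\end{aligned}
$$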


The integral on the right-hand side is the average over x and y of $\log [1/p(x|y)]$. But
$\log [1/p(x|y)]$ represents the uncertainty about x when y is received. This, as we have seen,
is the information lost over the channel. The average of $\log [1/p(x|y)]$ is the average loss of
information when some x is transmitted and some y is received. This, by definition, is H(x|y),
the conditional (differential) entropy of x given y,

$$ H(\mathrm{x}|\mathrm{y}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log \frac{1}{p(x|y)}\, dx\, dy \tag{13.47} $$

Hence,

$$ I(\mathrm{x}; \mathrm{y}) = H(\mathrm{x}) - H(\mathrm{x}|\mathrm{y}) \tag{13.48} $$

Thus, when some value of x is transmitted and some value of y is received, the average
information transmitted over the channel is I(x; y), given by Eq. (13.48). We can define the
channel capacity $C_s$ as the maximum amount of information that can be transmitted, on the
average, per sample or per value transmitted:

$$ C_s = \max I(\mathrm{x}; \mathrm{y}) \tag{13.49} $$

For a given channel, I(x; y) is a function of the input probability density p(x) alone. This can
be shown as follows:


$$ p(x,y) = p(x)\,p(y|x) \tag{13.50} $$

$$
\frac{p(x|y)}{p(x)} = \frac{p(y|x)}{p(y)}
= \frac{p(y|x)}{\int_{-\infty}^{\infty} p(x,y)\, dx}
= \frac{p(y|x)}{\int_{-\infty}^{\infty} p(x)\,p(y|x)\, dx} \tag{13.51}
$$

Substituting Eqs. (13.50) and (13.51) into Eq. (13.45b), we obtain

$$
I(\mathrm{x}; \mathrm{y}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x)\,p(y|x) \log \left( \frac{p(y|x)}{\int_{-\infty}^{\infty} p(x)\,p(y|x)\, dx} \right) dx\, dy \tag{13.52}
$$

The conditional probability density p(y|x) is characteristic of a given channel. Hence, for a
given channel specified by p(y|x), I(x; y) is a function of the input probability density p(x)
alone. Thus,

$$ C_s = \max_{p(x)} I(\mathrm{x}; \mathrm{y}) $$

If the channel allows the transmission of K values per second, then C, the channel capacity
per second, is given by

$$ C = K C_s \quad \text{bit/s} \tag{13.53} $$

Just as in the case of discrete variables, I(x; y) is symmetrical with respect to x and y for
continuous random variables. This can be seen by rewriting Eq. (13.45b) as

$$ I(\mathrm{x}; \mathrm{y}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}\, dx\, dy \tag{13.54} $$

This equation shows that I(x; y) is symmetrical with respect to x and y. Hence,

$$ I(\mathrm{x}; \mathrm{y}) = I(\mathrm{y}; \mathrm{x}) $$

From Eq. (13.48) it now follows that

$$ I(\mathrm{x}; \mathrm{y}) = H(\mathrm{x}) - H(\mathrm{x}|\mathrm{y}) = H(\mathrm{y}) - H(\mathrm{y}|\mathrm{x}) \tag{13.55} $$
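The double integral of Eq. (13.52) can be evaluated numerically for a concrete channel. The sketch below is an illustration under stated assumptions, not the book's method: it assumes an additive Gaussian noise channel y = x + n with a Gaussian input, replaces the integrals with Riemann sums on a truncated grid, and uses hypothetical function names. The grid size and truncation span are arbitrary choices.

```python
import math

def gaussian_pdf(v, variance):
    """Zero-mean Gaussian density with the given variance."""
    return math.exp(-v * v / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def mutual_information_bits(signal_power, noise_power, points=401, span=8.0):
    """Riemann-sum estimate of Eq. (13.52) for y = x + n with Gaussian x and n."""
    sx = math.sqrt(signal_power)
    sy = math.sqrt(signal_power + noise_power)
    xs = [-span * sx + 2 * span * sx * i / (points - 1) for i in range(points)]
    ys = [-span * sy + 2 * span * sy * j / (points - 1) for j in range(points)]
    dx = xs[1] - xs[0]
    dy = ys[1] - ys[0]
    px = [gaussian_pdf(x, signal_power) for x in xs]
    # p(y|x) = p_n(y - x) for an additive-noise channel
    py_given_x = [[gaussian_pdf(y - x, noise_power) for y in ys] for x in xs]
    # p(y) = integral of p(x) p(y|x) dx, the denominator inside the log of Eq. (13.52)
    py = [sum(px[i] * py_given_x[i][j] for i in range(points)) * dx
          for j in range(points)]
    total = 0.0
    for i in range(points):
        for j in range(points):
            pxy = px[i] * py_given_x[i][j]        # Eq. (13.50)
            if pxy > 0.0:                         # skip terms that underflow to zero
                total += pxy * math.log2(py_given_x[i][j] / py[j])
    return total * dx * dy

if __name__ == "__main__":
    print(mutual_information_bits(4.0, 1.0))
```

With S = 4 and N = 1 the estimate should agree closely with (1/2) log2(1 + S/N), the per-sample value that the text obtains later in Eq. (13.62b).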

Capacity of a Band-Limited AWGN Channel 

The channel capacity C is, by definition, the maximum rate of information transmission over
a channel. The mutual information I(x; y) is given by Eq. (13.55):

$$ I(\mathrm{x}; \mathrm{y}) = H(\mathrm{y}) - H(\mathrm{y}|\mathrm{x}) \tag{13.56} $$

The channel capacity C is the maximum value of the mutual information I(x; y) per second.
Let us first find the maximum value of I(x; y) per sample. We shall find here the capacity
of a channel band-limited to B Hz and disturbed by a white Gaussian noise of PSD $\mathcal{N}/2$. In
addition, we shall constrain the signal power (or its mean square value) to S. The disturbance
is assumed to be additive; that is, the received signal y(t) is given by

$$ y(t) = x(t) + n(t) \tag{13.57} $$

Because the channel is band-limited, both the signal x(t) and the noise n(t) are band-limited
to B Hz. Obviously, y(t) is also band-limited to B Hz. All these signals can therefore be
completely specified by samples taken at the uniform rate of 2B samples per second. Let us
find the maximum information that can be transmitted per sample. Let x, n, and y represent
samples of x(t), n(t), and y(t), respectively. The information I(x; y) transmitted per sample is
given by Eq. (13.56):

$$ I(\mathrm{x}; \mathrm{y}) = H(\mathrm{y}) - H(\mathrm{y}|\mathrm{x}) $$

We shall now find H(y|x). By definition [Eq. (13.47)],


$$
H(\mathrm{y}|\mathrm{x}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x,y) \log \frac{1}{p(y|x)}\, dx\, dy
= \int_{-\infty}^{\infty} p(x)\, dx \int_{-\infty}^{\infty} p(y|x) \log \frac{1}{p(y|x)}\, dy
$$

Because

$$ \mathrm{y} = \mathrm{x} + \mathrm{n} $$

for a given x, y is equal to n plus a constant (x). Hence, the distribution of y when x has a given
value is identical to that of n except for a translation by x. If $p_n(\cdot)$ represents the PDF of noise
sample n, then

$$ p(y|x) = p_n(y - x) \tag{13.58} $$

$$ \int_{-\infty}^{\infty} p(y|x) \log \frac{1}{p(y|x)}\, dy = \int_{-\infty}^{\infty} p_n(y - x) \log \frac{1}{p_n(y - x)}\, dy $$

Letting $y - x = z$, we have

$$ \int_{-\infty}^{\infty} p(y|x) \log \frac{1}{p(y|x)}\, dy = \int_{-\infty}^{\infty} p_n(z) \log \frac{1}{p_n(z)}\, dz $$

The right-hand side is the entropy H(n) of the noise sample n. Hence,

$$ H(\mathrm{y}|\mathrm{x}) = H(\mathrm{n}) \int_{-\infty}^{\infty} p(x)\, dx = H(\mathrm{n}) \tag{13.59} $$

In deriving Eq. (13.59), we made no assumptions about the noise. Hence, Eq. (13.59) is very 
general and applies to all types of noise. The only condition is that the noise disturb the channel 
in an additive fashion. Thus, 


$$ I(\mathrm{x}; \mathrm{y}) = H(\mathrm{y}) - H(\mathrm{n}) \quad \text{bits per sample} \tag{13.60} $$
We have assumed that the mean square value of the signal x(t) is constrained to have a value
S, and the mean square value of the noise is N. We shall also assume that the signal x(t) and
the noise n(t) are independent. In such a case, the mean square value of y will be the sum of
the mean square values of x and n. Hence,

$$ \overline{\mathrm{y}^2} = S + N $$

For a given noise [given H(n)], I(x; y) is maximum when H(y) is maximum. We have seen
that for a given mean square value of y ($\overline{\mathrm{y}^2} = S + N$), H(y) will be maximum if y is Gaussian,
and the maximum entropy $H_{\max}(\mathrm{y})$ is then given by

$$ H_{\max}(\mathrm{y}) = \frac{1}{2} \log [2\pi e (S + N)] \tag{13.61} $$

Because

$$ \mathrm{y} = \mathrm{x} + \mathrm{n} $$

and n is Gaussian, y will be Gaussian only if x is Gaussian. As the mean square value of x is
S, this implies that

$$ p(x) = \frac{1}{\sqrt{2\pi S}}\, e^{-x^2/2S} $$

and

$$
\begin{aligned}
I_{\max}(\mathrm{x}; \mathrm{y}) &= H_{\max}(\mathrm{y}) - H(\mathrm{n}) \\
&= \frac{1}{2} \log [2\pi e (S + N)] - H(\mathrm{n})
\end{aligned}
$$

For a white Gaussian noise with mean square value N,

$$ H(\mathrm{n}) = \frac{1}{2} \log 2\pi e N \qquad N = \mathcal{N} B $$

and

$$
\begin{aligned}
C_s = I_{\max}(\mathrm{x}; \mathrm{y}) &= \frac{1}{2} \log \left( \frac{S + N}{N} \right) && \text{(13.62a)} \\
&= \frac{1}{2} \log \left( 1 + \frac{S}{N} \right) && \text{(13.62b)}
\end{aligned}
$$


The channel capacity per second will be the maximum information that can be transmitted
per second. Equations (13.62) represent the maximum information transmitted per sample.
If all the samples are statistically independent, the total information transmitted per second
will be 2B times $C_s$. If the samples are not independent, then the total information will be
less than $2BC_s$. Because the channel capacity C represents the maximum possible information
transmitted per second,

$$
\begin{aligned}
C &= 2B \left[ \frac{1}{2} \log \left( 1 + \frac{S}{N} \right) \right] \\
&= B \log \left( 1 + \frac{S}{N} \right) \quad \text{bit/s}
\end{aligned} \tag{13.63}
$$
The samples of a band-limited Gaussian signal are independent if and only if the signal
power spectral density (PSD) is uniform over the band (Example 9.2 and Prob. 9.2-3). Obviously,
to transmit information at the maximum rate [Eq. (13.63)], the PSD of signal y(t) must
be uniform. The PSD of y is given by

$$ S_y(f) = S_x(f) + S_n(f) $$

Because $S_n(f) = \mathcal{N}/2$, the PSD of x(t) must also be uniform. Thus, the maximum rate of
transmission (C bit/s) is attained when x(t) is also a white Gaussian signal.

To recapitulate, when the channel noise is additive, white, and Gaussian with mean square
value N ($N = \mathcal{N}B$), the channel capacity C of a band-limited channel under the constraint of
a given signal power S is given by

$$ C = B \log \left( 1 + \frac{S}{N} \right) \quad \text{bit/s} $$

where B is the channel bandwidth in hertz. The maximum rate of transmission (C bit/s) can
be realized only if the input signal is a white Gaussian signal.
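As a quick numerical illustration of Eqs. (13.62b) and (13.63), the following sketch evaluates the capacity for an assumed channel. The function names and the example figures (a 3 kHz channel at 30 dB SNR) are hypothetical choices for illustration, not values taken from the text.

```python
import math

def capacity_per_sample_bits(snr):
    """Eq. (13.62b): C_s = (1/2) log2(1 + S/N) bits per sample."""
    return 0.5 * math.log2(1 + snr)

def awgn_capacity_bps(bandwidth_hz, signal_power, noise_psd):
    """Eq. (13.63): C = B log2(1 + S/N) with N = noise_psd * B."""
    noise_power = noise_psd * bandwidth_hz
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

if __name__ == "__main__":
    bandwidth = 3000.0                 # assumed bandwidth in Hz
    noise_psd = 1e-9                   # assumed noise PSD parameter (W/Hz)
    snr = 10 ** (30 / 10)              # 30 dB expressed as a power ratio
    signal_power = snr * noise_psd * bandwidth
    c = awgn_capacity_bps(bandwidth, signal_power, noise_psd)
    print(f"C = {c:.1f} bit/s")
    # C equals 2B independent samples per second times C_s
    print(f"2B * C_s = {2 * bandwidth * capacity_per_sample_bits(snr):.1f} bit/s")
```

Both printed values coincide, which reflects the step from per-sample capacity to capacity per second when the 2B Nyquist samples are independent.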

Capacity of a Channel of Infinite Bandwidth 

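Superficially, Eq. (13.63) seems to indicate that the channel capacity goes to $\infty$ as the channel's
bandwidth B goes to $\infty$. This, however, is not true. For white noise, the noise power $N = \mathcal{N}B$.
Hence, as B increases, N also increases. It can be shown that in the limit as $B \to \infty$, C
approaches a limit:

$$ C = B \log \left( 1 + \frac{S}{N} \right) = B \log \left( 1 + \frac{S}{\mathcal{N}B} \right) $$

$$
\lim_{B \to \infty} C = \lim_{B \to \infty} B \log \left( 1 + \frac{S}{\mathcal{N}B} \right)
= \lim_{B \to \infty} \frac{S}{\mathcal{N}} \left[ \frac{\mathcal{N}B}{S} \log \left( 1 + \frac{S}{\mathcal{N}B} \right) \right]
$$

This limit can be found by noting that

$$ \lim_{x \to \infty} x \log_2 \left( 1 + \frac{1}{x} \right) = \log_2 e = 1.44 $$

Hence,

$$ \lim_{B \to \infty} C = 1.44\, \frac{S}{\mathcal{N}} \quad \text{bit/s} \tag{13.64} $$

Thus, for a white Gaussian channel noise, the channel capacity C approaches a limit of $1.44\,S/\mathcal{N}$
as $B \to \infty$. The variation of C with B is shown in Fig. 13.5. It is evident that the capacity can
be made infinite only by increasing the signal power S to infinity. For finite signal and noise
powers, the channel capacity always remains finite.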
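The saturation of C with bandwidth can be seen by tabulating Eq. (13.63) for growing B at fixed signal power. This is a minimal sketch; the function names and the assumed ratio of signal power to noise PSD (1000 Hz) are illustrative only.

```python
import math

def capacity_bps(bandwidth_hz, signal_power, noise_psd):
    """Eq. (13.63) with N = noise_psd * B."""
    return bandwidth_hz * math.log2(1 + signal_power / (noise_psd * bandwidth_hz))

def infinite_bandwidth_limit_bps(signal_power, noise_psd):
    """Eq. (13.64): limit of C as B grows without bound, (log2 e) * S / noise_psd."""
    return math.log2(math.e) * signal_power / noise_psd

if __name__ == "__main__":
    signal_power, noise_psd = 1.0, 1e-3       # assumed values, S / noise_psd = 1000 Hz
    limit = infinite_bandwidth_limit_bps(signal_power, noise_psd)
    for bandwidth in (1e2, 1e3, 1e4, 1e5, 1e6):
        c = capacity_bps(bandwidth, signal_power, noise_psd)
        print(f"B = {bandwidth:9.0f} Hz   C = {c:8.1f} bit/s   C/limit = {c / limit:.4f}")
```

The ratio C/limit increases toward 1 but never exceeds it, matching the shape described for Fig. 13.5.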
Figure 13.5 Channel capacity vs. bandwidth for a channel with white Gaussian noise and fixed signal power.

Figure 13.6 (a) Signal space representation of transmitted and received signals and noise signal. (b) Choice of signals for error-free communication.



Verification of Error-Free Communication over 
a Continuous Channel 

Using the concepts of information theory, we have shown that it is possible to transmit error-free
information at a rate of $B \log_2 (1 + S/N)$ bit/s over a channel band-limited to B Hz. The
signal power is S, and the channel noise is white Gaussian with power N. This theorem can be
verified in a way similar to that used for the verification of the channel capacity of a discrete
case. This verification using signal space is so general that it is in reality an alternate proof of
the capacity theorem.

Let us consider M-ary communication with M equiprobable messages $m_1, m_2, \ldots, m_M$
transmitted by signals $s_1(t), s_2(t), \ldots, s_M(t)$. All signals are time-limited with duration T and
have an essential bandwidth B Hz. Their powers are less than or equal to S. The channel is
band-limited to B, and the channel noise is white Gaussian with power N.

All the signals and noise waveforms have 2BT + 1 dimensions. In the limit we shall let
$T \to \infty$. Hence $2BT \gg 1$, and the number of dimensions will be taken as 2BT in our future
discussion. Because the noise power is N, the energy of the noise waveform of T-second
duration is NT. Given signal power S, the maximum signal energy is ST. Because signals and
noise are independent, the maximum received energy is (S + N)T. Hence, all the received
signals will be in a 2BT-dimensional hypersphere of radius $\sqrt{(S + N)T}$ (Fig. 13.6a). A typical
received signal $s_i(t) + n(t)$ has an energy $(S_i + N)T$, and the point r representing this signal
lies at a distance of $\sqrt{(S_i + N)T}$ from the origin (Fig. 13.6a). The signal vector $\mathbf{s}_i$, the noise
vector n, and the received vector r are shown in Fig. 13.6a. Because

$$ |\mathbf{s}_i| = \sqrt{S_i T}, \qquad |\mathbf{n}| = \sqrt{NT}, \qquad |\mathbf{r}| = \sqrt{(S_i + N)T} \tag{13.65} $$
it follows that vectors $\mathbf{s}_i$, n, and r form a right triangle. Also, n lies on the sphere of radius
$\sqrt{NT}$, centered at $\mathbf{s}_i$. Note that because n is random, it can lie anywhere on the sphere centered
at $\mathbf{s}_i$.*

We have M possible transmitted vectors located inside the big sphere. For each possible
$\mathbf{s}_i$ we can draw a sphere of radius $\sqrt{NT}$ around $\mathbf{s}_i$. If a received vector r lies on one of the small
spheres, the center of that sphere is the transmitted waveform. If we pack the big sphere with
M nonoverlapping and nontouching spheres, each of radius $\sqrt{NT}$ (Fig. 13.6b), and use the
centers of these M spheres for the transmitted waveforms, we will be able to detect all these
M waveforms correctly at the receiver simply by using the maximum likelihood receiver.
The maximum likelihood receiver looks at the received signal point r and decides that the
transmitted signal is that one of the M possible transmitted points that is closest to r (smallest
error vector). Every received point r will lie on the surface of one of the M nonoverlapping
spheres, and using the maximum likelihood criterion, the transmitted signal will be chosen
correctly as the point lying at the center of the sphere on which r lies.

Hence, our task is to find out how many such nonoverlapping small spheres can be packed
into the big sphere. To compute this number, we must determine the volume of a sphere of D
dimensions.

Volume of a D-Dimensional Sphere 

A D-dimensional sphere is described by the equation

$$ x_1^2 + x_2^2 + \cdots + x_D^2 = R^2 $$

where R is the radius of the sphere. We can show that the volume V(R) of a sphere of radius
R is given by

$$ V(R) = R^D V(1) \tag{13.66} $$

where V(1) is the volume of a D-dimensional sphere of unit radius and, thus, is constant. To
prove this, we have by definition

$$ V(R) = \int \cdots \int_{x_1^2 + x_2^2 + \cdots + x_D^2 \le R^2} dx_1\, dx_2 \cdots dx_D $$

Letting $y_j = x_j/R$, we have

$$
V(R) = R^D \int \cdots \int_{y_1^2 + y_2^2 + \cdots + y_D^2 \le 1} dy_1\, dy_2 \cdots dy_D = R^D V(1)
$$

Hence, the ratio of the volumes of two spheres of radii $\hat{R}$ and R is

$$ \frac{V(\hat{R})}{V(R)} = \left( \frac{\hat{R}}{R} \right)^D $$

* Because N is the average noise power, the energy over an interval T is $NT + \epsilon$, where $\epsilon \to 0$ as $T \to \infty$. Hence,
we can assume that n lies on the sphere.
As a direct consequence of this result, when D is large, almost all of the volume of the sphere
is concentrated at the surface. This is because if $\hat{R}/R < 1$, then $(\hat{R}/R)^D \to 0$ as $D \to \infty$.
This ratio approaches zero even if $\hat{R}$ differs from R by a very small amount $\Delta$ (Fig. 13.7). This
means that no matter how small $\Delta$ is, the volume within radius $\hat{R}$ is a negligible fraction of
the total volume within radius R if D is large enough. Hence, for a large D, almost all of the
volume of a D-dimensional sphere is concentrated at the surface. Such a result sounds strange,
but a little reflection will show that it is reasonable. This is because the volume is proportional
to the Dth power of the radius. Thus, for large D, a small increase in R can increase the volume
tremendously, and all the increase comes from a tiny increase in R near the surface of the
sphere. This means that most of the volume must be concentrated at the surface.
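The concentration of volume near the surface follows directly from the ratio of volumes derived above, and a few numbers make it concrete. This is an illustrative sketch; the function name and the 1% shell thickness are assumptions chosen for the example.

```python
def shell_volume_fraction(dimensions, relative_thickness):
    """Fraction of a D-dimensional sphere's volume lying within a thin outer shell.

    The inner sphere has radius (1 - relative_thickness) * R, so by Eq. (13.66)
    its share of the volume is (1 - relative_thickness) ** D.
    """
    return 1.0 - (1.0 - relative_thickness) ** dimensions

if __name__ == "__main__":
    for d in (3, 10, 100, 1000, 10000):
        frac = shell_volume_fraction(d, 0.01)
        print(f"D = {d:6d}: outer 1% shell holds {100 * frac:.4f}% of the volume")
```

In three dimensions the outer 1% shell holds only about 3% of the volume, whereas for D in the thousands it holds essentially all of it.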

The number of nonoverlapping spheres of radius $\sqrt{NT}$ that can be packed into a sphere
of radius $\sqrt{(S + N)T}$ is bounded by the ratio of the volume of the signal sphere to the volume
of the noise sphere. Hence,

$$
M \le \frac{\left[ \sqrt{(S + N)T} \right]^{2BT} V(1)}{\left[ \sqrt{NT} \right]^{2BT} V(1)} = \left( 1 + \frac{S}{N} \right)^{BT} \tag{13.67}
$$

Each of the M-ary signals carries the information of $\log_2 M$ binary digits. Hence, the
transmission of one of the M signals every T seconds is equivalent to the information rate C
given by

$$ C = \frac{\log_2 M}{T} \le B \log \left( 1 + \frac{S}{N} \right) \quad \text{bit/s} \tag{13.68} $$

This equation gives the upper limit of C.

To show that we can actually receive error-free information at a rate of $B \log (1 + S/N)$,
we use the argument proposed by Shannon. 8 Instead of choosing the M transmitted messages
at the centers of nonoverlapping spheres (Fig. 13.6b), Shannon proposed selecting the M
points randomly located in the signal sphere $I_s$ of radius $\sqrt{ST}$ (Fig. 13.8). Consider one
particular transmitted signal $\mathbf{s}_k$. Because the signal energy is assumed to be $\le ST$, point $\mathbf{s}_k$
will lie somewhere inside the signal sphere $I_s$ of radius $\sqrt{ST}$. Because all the M signals are
picked randomly from this sphere, the probability of finding a signal within a volume $\Delta V$ is
$\min(1, M \Delta V/V_s)$, where $V_s$ is the volume of $I_s$. But because for large D all of the volume
of the sphere is concentrated at the surface, all M signal points selected randomly would lie
near the surface of $I_s$. Figure 13.8 shows the transmitted signal $\mathbf{s}_k$, the received signal r, and
the noise n. We draw a sphere of radius $\sqrt{NT}$ with r as the center. This sphere intersects the
sphere $I_s$ and forms a common lens-shaped region. The signal $\mathbf{s}_k$ lies on the surface of both
spheres. We shall use a maximum likelihood receiver. This means that when r is received,
we shall make the decision that "$\mathbf{s}_k$ was transmitted," provided none of the remaining M − 1
signal points are closer to r than $\mathbf{s}_k$. The probability of finding any one signal in the lens is
$V_{\text{lens}}/V_s$. Hence $P_e$, the error probability in the detection of $\mathbf{s}_k$ when r is received, is

Figure 13.8 Derivation of channel capacity.

$$ P_e = (M - 1) \frac{V_{\text{lens}}}{V_s} < M \frac{V_{\text{lens}}}{V_s} $$

From Fig. 13.8, we observe that $V_{\text{lens}} < V(h)$, where V(h) is the volume of the D-dimensional
sphere of radius h. Because r, $\mathbf{s}_k$, and n form a right triangle,

$$ h \sqrt{(S + N)T} = \sqrt{(ST)(NT)} \qquad \text{and} \qquad h = \sqrt{\frac{SNT}{S + N}} $$

Hence,

$$ V(h) = \left( \frac{SNT}{S + N} \right)^{BT} V(1) $$

Also,

$$ V_s = (ST)^{BT} V(1) $$

and

$$ P_e < M \left( \frac{N}{S + N} \right)^{BT} $$

If we choose

$$ M = \left[ k \left( 1 + \frac{S}{N} \right) \right]^{BT} $$

then

$$ P_e < [k]^{BT} $$

If we let $k = 1 - \Delta$, where $\Delta$ is a positive number chosen as small as we wish, then

$$ P_e \to 0 \quad \text{as } BT \to \infty $$

This means that $P_e$ can be made arbitrarily small by increasing T, provided M is chosen
arbitrarily close to $(1 + S/N)^{BT}$. Thus,

$$ C = \frac{1}{T} \log_2 M = \left[ B \log_2 \left( 1 + \frac{S}{N} \right) - \epsilon \right] \quad \text{bit/s} \tag{13.69} $$

where $\epsilon$ is a positive number chosen as small as we please. This leads to $k = 2^{-\epsilon/B}$ and proves
the desired result. A more rigorous derivation of this result can be found in Wozencraft and
Jacobs. 9

Because the M signals are selected randomly from the signal space, they tend to acquire
the statistics of white noise 8 (i.e., a white Gaussian random process).


Comments on Channel Capacity 

According to the result derived in this chapter, theoretically we can communicate error-free
up to C bit/s. There are, however, practical difficulties in achieving this rate. In proving the
capacity formula, we assumed that communication is effected by signals of duration T. This
means we must wait T seconds to accumulate the input data and then encode it by one of the
waveforms of duration T. Because the capacity rate is achieved only in the limit as $T \to \infty$,
we have a long wait at the receiver to get the information. Moreover, because the number of
possible messages that can be transmitted over interval T increases exponentially with T, the
transmitter and receiver structures increase in complexity beyond imagination as $T \to \infty$.

The channel capacity indicated by Shannon's equation [Eq. (13.69)] is the maximum error-free
communication rate achievable on an optimum system without any restrictions (except
for bandwidth B, signal power S, and Gaussian white channel noise power N). If we have
any other restrictions, this maximum rate will not be achieved. For example, if we consider a
binary channel (a channel restricted to transmit only binary signals), we will not be able to attain
Shannon's rate, even if the channel is optimum. In Sec. 13.9, MATLAB Computer Exercise
13.2 supplies numerical confirmation. The channel capacity formula [Eq. (13.63)] indicates
that the transmission rate is a monotonically increasing function of the signal power S. If we
use a binary channel, however, we will find that increasing the transmitted power beyond a certain
point buys very little advantage. Hence, on a binary channel, increasing S will not increase
the error-free communication rate beyond some value. This does not mean that the channel
capacity formula has failed. It simply means that when we have a large amount of power (with
a finite bandwidth) available, the binary scheme is not the optimum communication scheme.

One last comment: Shannon's results tell us the upper theoretical limit of error-free
communication. But they do not tell us precisely how this can be achieved. To quote the words of
Abramson, written in 1963: "[This is one of the problems] which has persisted to mock
information theorists since Shannon's original paper in 1948. Despite an enormous amount of effort
spent since that time in quest of this Holy Grail of information theory, a deterministic method
of generating the codes promised by Shannon is still to be found." 4 Amazingly, 30 years later,
the introduction of turbo codes and the rediscovery of the low-density parity check (LDPC)
codes would completely alter the landscape. We shall introduce these codes in Chapter 14.


13.6 PRACTICAL COMMUNICATION SYSTEMS IN 
LIGHT OF SHANNON'S EQUATION 


It would be instructive to determine the ideal law for the exchange between the SNR and
the transmission bandwidth by using the channel capacity equation. Consider a message of
bandwidth B that is used for modulation (or coding), with the resulting modulated signal of
bandwidth $B_T$. This signal is received at the input of an ideal demodulator with signal and noise
powers of $S_i$ and $N_i$, respectively* (Fig. 13.9). The demodulator output bandwidth is B, and the
SNR is $S_o/N_o$. Because an SNR S/N and a bandwidth B can transmit ideally B log (1 + S/N)
bits of information, the ideal information rates of the signals at the input and the output of the
demodulator are $B_T \log (1 + S_i/N_i)$ bits and $B \log (1 + S_o/N_o)$ bits, respectively. Because the
demodulator neither creates nor destroys information, the two rates should be equal, that is,

$$ B_T \log \left( 1 + \frac{S_i}{N_i} \right) = B \log \left( 1 + \frac{S_o}{N_o} \right) $$

and

$$ 1 + \frac{S_o}{N_o} = \left( 1 + \frac{S_i}{N_i} \right)^{B_T/B} \tag{13.70a} $$

In practice, for the majority of systems, $S_o/N_o$ as well as $S_i/N_i \gg 1$, and

$$ \frac{S_o}{N_o} \simeq \left( \frac{S_i}{N_i} \right)^{B_T/B} \tag{13.70b} $$

Also,

$$ \frac{S_i}{N_i} = \frac{S_i}{\mathcal{N} B_T} = \left( \frac{S_i}{\mathcal{N} B} \right) \left( \frac{B}{B_T} \right) = \frac{\gamma}{B_T/B} $$

Hence, Eqs. (13.70) become

$$ 1 + \frac{S_o}{N_o} = \left( 1 + \frac{\gamma}{B_T/B} \right)^{B_T/B} \tag{13.71a} $$

$$ \frac{S_o}{N_o} \simeq \left( \frac{\gamma}{B_T/B} \right)^{B_T/B} \tag{13.71b} $$

* An additive white Gaussian channel noise is assumed.

Figure 13.9 Ideal exchange between SNR and bandwidth. (Block diagram: input with $S_i$, $N_i$ and bandwidth $B_T$; ideal demodulator; output with bandwidth B.)

Figure 13.10 Ideal behavior of SNR vs. $\gamma$ for various ratios of $B_T$ to B.


Equations (13.70) and (13.71) give the ideal law of exchange between the SNR and the bandwidth.
The output SNR $S_o/N_o$ is plotted in Fig. 13.10 as a function of $\gamma$ for various values of
$B_T/B$.

The output SNR increases exponentially with the bandwidth expansion factor $B_T/B$. This
means that to maintain a given output SNR, the transmitted signal power can be reduced
exponentially with the bandwidth expansion factor. Thus, for a small increase in bandwidth,
we can cut the transmitted power considerably. On the other hand, for a small reduction in
bandwidth, we need to increase the transmitted power considerably.
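The exchange between power and bandwidth can be tabulated from the relation 1 + S_o/N_o = (1 + γ/(B_T/B))^(B_T/B), where γ = S_i/(𝒩B). The sketch below is illustrative only: the function names and the 40 dB target output SNR are assumptions made for the example.

```python
import math

def ideal_output_snr(gamma, expansion):
    """Ideal law of exchange: S_o/N_o = (1 + gamma / expansion) ** expansion - 1,
    where expansion is the bandwidth expansion factor B_T / B."""
    return (1 + gamma / expansion) ** expansion - 1

def required_gamma(output_snr, expansion):
    """Inverse of the relation above: gamma needed to reach a given output SNR."""
    return expansion * ((1 + output_snr) ** (1 / expansion) - 1)

if __name__ == "__main__":
    target = 1e4                                   # assumed target output SNR (40 dB)
    for expansion in (1, 2, 4, 8):
        g = required_gamma(target, expansion)
        print(f"B_T/B = {expansion}: required gamma = {10 * math.log10(g):6.2f} dB")
```

Each doubling of the expansion factor lowers the required γ substantially, which is the behavior described for the ideal system.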

Let us now investigate how two digital systems fare in comparison to the ideal system. 


PCM 

As seen earlier, M-ary PCM shows a saturation effect unless we go to higher values of M as
$\gamma$ increases. If the message signal is quantized in L levels, then each sample can be encoded
by $\log_M L$ number of M-ary pulses. If B is the bandwidth of the message signal, we need to
transmit 2B samples per second. Consequently, $R_M$, the number of M-ary pulses per second, is

$$ R_M = 2B \log_M L $$

Also, the transmission bandwidth $B_T$ is half the number of (M-ary) pulses per second. Hence,

$$ B_T = \frac{R_M}{2} = B \log_M L \tag{13.72a} $$

From Eq. (10.98a), the power $S_i$ is found as

$$ S_i = \frac{M^2 - 1}{3}\, E_p R_M \tag{13.72b} $$

Also,

$$ N_i = \mathcal{N} B_T = \frac{\mathcal{N} R_M}{2} \tag{13.73} $$
Each of the M-ary pulses carries the information of $\log_2 M$ bits, and we are transmitting
$2B \log_M L$ number of M-ary pulses per second. Hence, we are transmitting information at a
rate of $R_b$ bit/s, where

$$
\begin{aligned}
R_b &= (2B \log_M L)(\log_2 M) \\
&= 2B_T \log_2 M \\
&= B_T \log_2 M^2 \quad \text{bit/s}
\end{aligned}
$$

Substitution of Eqs. (13.72b) and (13.73) into this equation yields

$$ R_b = B_T \log_2 \left( 1 + \frac{3\mathcal{N}}{2E_p} \frac{S_i}{N_i} \right) \quad \text{bit/s} \tag{13.74} $$

We are transmitting the information equivalent of $R_b$ binary digits per second over the M-ary
PCM channel. The reception is not error free, however. The pulses are detected with an error
probability $P_{eM}$ given in Eq. (10.99c). If $P_{eM}$ is on the order of $10^{-6}$, we could consider the
reception to be essentially error free. From Eq. (10.99c),

$$ P_{eM} \simeq 2Q \left( \sqrt{\frac{2E_p}{\mathcal{N}}} \right) = 10^{-6} \qquad M \gg 1 $$

This gives

$$ \frac{2E_p}{\mathcal{N}} \simeq 24 $$

Substitution of this value in Eq. (13.74) gives

$$ R_b = B_T \log_2 \left( 1 + \frac{1}{8} \frac{S_i}{N_i} \right) \quad \text{bit/s} \tag{13.75} $$

Thus, over a channel of bandwidth $B_T$ with an SNR of $S_i/N_i$, a PCM system can transmit
information at a rate of $R_b$ in Eq. (13.75). The ideal channel with bandwidth $B_T$ and SNR
$S_i/N_i$ transmits information at a rate of C bit/s, where

$$ C = B_T \log_2 \left( 1 + \frac{S_i}{N_i} \right) \quad \text{bit/s} \tag{13.76} $$

It follows that PCM uses roughly eight times (9 dB) as much power as the ideal system. This
performance is still much superior to that of FM. Figure 13.11 shows $R_b/B_T$ as a function of
$S_i/N_i$. For the ideal system, we have

$$ \frac{R_b}{B_T} = \frac{C}{B_T} = \log_2 \left( 1 + \frac{S_i}{N_i} \right) \tag{13.77} $$

PCM at the threshold is 9 dB inferior to the ideal curve.

When PCM is in saturation, the detection error probability approaches 0. Each M-ary
pulse transmits $\log_2 M$ bits, and there are $2B_T$ pulses per second. Hence,

$$ R_b = 2B_T \log_2 M \tag{13.78} $$

or

$$ \frac{R_b}{B_T} = 2 \log_2 M \tag{13.79} $$

This is clearly seen in Fig. 13.11 (solid horizontal lines).

Figure 13.11 Comparison of ideal system behavior to that of PCM.
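The 9 dB gap between PCM at threshold and the ideal system can be reproduced with a few lines of arithmetic. The sketch below assumes the threshold rate has the form R_b/B_T = log2(1 + (1/8) S_i/N_i), which is consistent with the statement that PCM needs roughly eight times the power of the ideal system; the function names are hypothetical, and the Gaussian tail function Q is computed from the standard-library `math.erfc`.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def pcm_threshold_rate_per_hz(snr):
    """Assumed threshold form of the PCM rate: log2(1 + snr / 8) bit/s per Hz."""
    return math.log2(1 + snr / 8)

def ideal_rate_per_hz(snr):
    """Eq. (13.77): C / B_T = log2(1 + S_i/N_i)."""
    return math.log2(1 + snr)

def pcm_saturation_rate_per_hz(m):
    """Eq. (13.79): R_b / B_T = 2 log2(M) once detection errors are negligible."""
    return 2 * math.log2(m)

if __name__ == "__main__":
    # A pulse SNR 2E_p/N of about 24 gives a pulse error probability near 1e-6
    print(f"2Q(sqrt(24)) = {2 * q_function(math.sqrt(24)):.3e}")
    print(f"power penalty = {10 * math.log10(8):.2f} dB")
    for snr_db in (10, 20, 30, 40):
        snr = 10 ** (snr_db / 10)
        print(f"{snr_db} dB: ideal {ideal_rate_per_hz(snr):.2f}, "
              f"PCM {pcm_threshold_rate_per_hz(snr):.2f} bit/s/Hz")
```

At every SNR, PCM with eight times the power matches the ideal rate, which is the horizontal 9 dB offset between the two curves.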


Orthogonal Signaling 

We have already shown that [Eq. (10.122)] for M-ary orthogonal signaling, the error-free
communication rate is

$$ R_b < 1.44\, \frac{S_i}{\mathcal{N}} \quad \text{bit/s} \tag{13.80} $$

We showed in Eq. (13.64) that this is precisely the rate of error-free communication over
an ideal channel with infinite bandwidth. Therefore, as $M \to \infty$, the bandwidth of an M-ary
scheme also approaches infinity, and its rate of communication approaches that of an ideal
channel.


13.7 FREQUENCY-SELECTIVE CHANNEL CAPACITY 

Thus far, we have limited the discussion of capacity to distortionless channels of finite bandwidth
under white Gaussian noise. Such a channel model is suitable for application when
channels are either flat or flat fading. In reality, we often face many types of complex channels.
In particular, we have shown, in Chapter 12, that most wireless communication channels in
the presence of significant multipath tend to be frequency-selective channels. We now take a
look at the capacity of frequency-selective channels that do not exhibit a distortionless (flat)
spectrum.

First, consider a band-limited AWGN channel whose random output is

$$ \mathrm{y} = H \cdot \mathrm{x} + \mathrm{n} $$

This channel has a constant gain of H across the bandwidth. Based on Eq. (13.63) this band-limited
(low-pass) AWGN channel with bandwidth B has capacity

$$ C = B \cdot \log \left( 1 + |H|^2 \frac{S}{N} \right) \quad \text{bit/s} \tag{13.81} $$

in which S and N are the signal power and the noise power, respectively. Furthermore, in
Chapter 4 and Chapter 9, we have demonstrated the equivalence of baseband and passband
channels through modulation. Therefore, given the same noise spectrum and bandwidth,
AWGN low-pass, band-limited channels and AWGN bandpass channels possess identical
channel capacity. We are now ready to describe the capacity of frequency-selective channels.

Consider a bandpass channel of infinitesimal bandwidth $\Delta f$ centered at a frequency $f_i$.
Within this small band, the channel gain is $H(f_i)$, the signal power spectral density (PSD) is
$S_x(f_i)$, and the Gaussian noise PSD is $S_n(f_i)$. Since this small bandwidth is basically a
band-limited AWGN channel, according to Eq. (13.63), its capacity is

$$
\begin{aligned}
C(f_i) &= \Delta f \cdot \log \left[ 1 + |H(f_i)|^2 \frac{S_x(f_i)\, \Delta f}{S_n(f_i)\, \Delta f} \right] \\
&= \log \left[ 1 + \frac{|H(f_i)|^2 S_x(f_i)}{S_n(f_i)} \right] \Delta f \quad \text{bit/s}
\end{aligned} \tag{13.82}
$$

This means that we can divide a frequency-selective channel H(f) into small disjoint
AWGN bandpass channels of bandwidth $\Delta f$. Thus, the sum channel capacity is simply
approximated by

$$ C = \sum_i \log \left[ 1 + \frac{|H(f_i)|^2 S_x(f_i)}{S_n(f_i)} \right] \Delta f \quad \text{bit/s} $$

In fact, the practical OFDM (or DMT) system discussed in Chapter 12 is precisely such a
system, which consists of a bank of parallel flat channels with different gains. This capacity
is an approximation because the channel response, the signal PSD, or the noise PSD, may
not be constant over a nonzero $\Delta f$. By taking $\Delta f \to 0$, we can determine the total channel
capacity as

$$ C = \int_{-\infty}^{\infty} \log \left[ 1 + \frac{|H(f)|^2 S_x(f)}{S_n(f)} \right] df \tag{13.83} $$

Maximum Capacity Power Loading 

In Eq. (13.83), we have established that the capacity of a frequency-selective channel with
response H(f) under colored Gaussian noise of power spectral density (PSD) $S_n(f)$ depends
on the input PSD $S_x(f)$. For the transmitter to utilize the full channel capacity, we now need
to find the optimum input power spectral density (PSD) $S_x(f)$ that can further maximize the
integral capacity

$$ \int_{-\infty}^{\infty} \log \left[ 1 + \frac{|H(f)|^2 S_x(f)}{S_n(f)} \right] df $$

To do so, we note that it would not be fair to consider an arbitrary input PSD $S_x(f)$ because
different power spectral densities may lead to different values of total input power. Given two
signals of the same PSD shape, the stronger signal with larger power has an unfair advantage
and costs more to transmit. Thus, a fair approach to channel capacity maximization should
limit the total input signal power to a transmitter power constraint P. Finding the best input
PSD under the total power constraint is known as the problem of maximum capacity power
loading.

The PSD that achieves the maximum capacity power loading is the solution to the
optimization problem of:

$$
\max_{S_x(f)} \int_{-\infty}^{\infty} \log \left( 1 + \frac{|H(f)|^2 S_x(f)}{S_n(f)} \right) df
\qquad \text{subject to} \quad \int_{-\infty}^{\infty} S_x(f)\, df \le P \tag{13.84}
$$

To solve this optimization problem, we again partition the channel (of bandwidth B) into K
narrow flat channels centered at $\{f_i,\ i = 1, 2, \ldots, K\}$ of bandwidth $\Delta f = B/K$. By denoting

$$ S_i = S_x(f_i)\, \Delta f \qquad H_i = H(f_i) \qquad N_i = S_n(f_i)\, \Delta f $$

the optimization problem becomes a discrete problem of

$$ \max_{\{S_i\}} \sum_{i=1}^{K} \log \left( 1 + \frac{|H_i|^2 S_i}{N_i} \right) \Delta f \tag{13.85a} $$

$$ \text{subject to} \quad \sum_{i=1}^{K} S_i = P \tag{13.85b} $$

The problem of finding the K optimum power values $\{S_i\}$ is the essence of the optimum power
loading problem.

This problem can be dealt by introducing a standard Lagrange multiplier k to form a 
modified objective function 


0(S\, S 2 . Sk) 




(13.86) 


Taking a partial derivative of G(S ], ..., Sk) with respect to Sj and setting it to zero, we have 


A f \Hj\ 2 

1 + \Hj\ 2 Sj/Nj Nj 

We rewrite this optimality condition into 

A f Nj 


Xln2 = 0 7=1,2. K 


k In 2 \Hj\ 


2 = S J 


j = 1, 2. K 
By defining a new variable $W = (\lambda \ln 2)^{-1}$, we ensure that the optimum power allocation
among the $K$ subchannels is

$$S_i = W\,\Delta f - \frac{N_i}{|H_i|^2} \qquad i = 1, 2, \ldots, K \tag{13.87a}$$

$$\text{such that} \quad \sum_{i=1}^{K} S_i = P \tag{13.87b}$$

The optimum power loading condition of Eq. (13.87) is not quite yet complete because some
$S_i$ may become negative if no special care is taken. Therefore, we must further constrain the
solution to ensure that $S_i \ge 0$ via

$$S_i = \max\left(W\,\Delta f - \frac{N_i}{|H_i|^2},\ 0\right) \qquad i = 1, 2, \ldots, K \tag{13.88a}$$

$$\text{such that} \quad \sum_{i=1}^{K} S_i = P \tag{13.88b}$$

The two relationships in Eq. (13.88) describe the solution of the power loading optimization
problem. We should note that there remains an unknown parameter $W$ that needs to be specified.
By enforcing the total power constraint $\sum_i S_i = P$, we can finally determine the unknown
parameter $W$.
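
Equations (13.88a) and (13.88b) translate directly into a short numerical routine. The Python sketch below is only an illustration under stated assumptions: the ratios $N_i/|H_i|^2$ are taken as already known and passed in a list (the name `inv_snr` is hypothetical), and $W$ is found by bisection, which is one of several ways to enforce the power constraint.

```python
def water_fill(inv_snr, total_power, delta_f=1.0, iterations=200):
    """Solve Eq. (13.88): S_i = max(W*delta_f - inv_snr[i], 0), sum(S_i) = total_power.

    inv_snr[i] stands for N_i / |H_i|^2 (assumed nonnegative and known).
    Returns the water level W and the list of powers S_i.
    """
    def used_power(level):
        # Total power consumed if the water level is `level`.
        return sum(max(level * delta_f - a, 0.0) for a in inv_snr)

    lo = 0.0                                        # uses no power
    hi = (total_power + max(inv_snr)) / delta_f     # uses at least total_power
    for _ in range(iterations):                     # bisection on a monotone function
        mid = 0.5 * (lo + hi)
        if used_power(mid) > total_power:
            hi = mid
        else:
            lo = mid
    level = 0.5 * (lo + hi)
    return level, [max(level * delta_f - a, 0.0) for a in inv_snr]


W, S = water_fill([1.0, 2.0, 5.0], total_power=3.0)
print(W, S)
```

For this toy input the water level settles at $W \approx 3$ with powers close to 2, 1, and 0: the third subchannel has $N_3/|H_3|^2 = 5 > W$ and therefore receives no power, exactly as the $\max(\cdot, 0)$ in Eq. (13.88a) prescribes.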

Finally, we take the limit as $\Delta f \to 0$ and $K \to \infty$. Since $S_i = S_x(f_i)\Delta f$ and
$N_i = S_n(f_i)\Delta f$, the optimum input signal PSD becomes

$$S_x(f) = \max\left(W - \frac{S_n(f)}{|H(f)|^2},\ 0\right) \tag{13.89a}$$

We note again that there is no closed-form solution given for the optimum constant $W$. Instead,
the optimum $W$ is obtained from the total input power constraint

$$\int_{-\infty}^{\infty} S_x(f)\, df = P \tag{13.89b}$$

or

$$P = \int_{\{f:\ W - S_n(f)/|H(f)|^2 > 0\}} \left(W - \frac{S_n(f)}{|H(f)|^2}\right) df \tag{13.89c}$$


Substituting the optimum PSD Eq. (13.89) into the capacity formula will lead to the maximum
channel capacity value of

$$C_{\max} = \int_{\{f:\ W - S_n(f)/|H(f)|^2 > 0\}} \log\left(\frac{W\,|H(f)|^2}{S_n(f)}\right) df \tag{13.90}$$


Water-Pouring Interpretation of Optimum Power Loading 

The optimum channel input PSD must satisfy the power constraint Eq. (13.89c). Once the
constant $W$ has been determined, the transmitter can adjust its transmission PSD to Eq. (13.89a),
which will maximize the channel capacity. This optimum solution to the channel input PSD
optimization problem is known as the water-filling or water-pouring solution.⁵
Figure 13.12 Illustration of water-pouring power allocation for maximizing frequency-selective channel capacity.



The literal water-pouring interpretation of optimum PSD design is illustrated by Fig. 13.12.
First, plot the frequency response $S_n(f)/|H(f)|^2$. This curve is viewed as shaped like the bottom
of a water container. Consider the total power as a bucket of water with total volume $P$. We
can then pour the entire bucket of water into the container to achieve equal water level. The
final water level will be raised to $W$ when the bucket is empty. The depth of the water for every
frequency $f$ is the desired optimum PSD level $S_x(f)$ as specified in Eq. (13.89a). Clearly,
when the noise PSD is large such that $S_n(f)/|H(f)|^2$ is high for some $f$, then there may be
zero water poured at those points. In other words, the optimum PSD for these frequencies
will be zero. Notice that a high value of $S_n(f)/|H(f)|^2$ means a low value of channel SNR
$|H(f)|^2/S_n(f)$. Conversely, when $S_n(f)/|H(f)|^2$ is low or SNR is high, the optimum PSD
value $S_x(f)$ should be kept high. In short, water-pouring power loading allocates more signal
power to frequencies at which the channel SNR $|H(f)|^2/S_n(f)$ is high and allocates little or
no signal power to frequencies at which the channel SNR $|H(f)|^2/S_n(f)$ is low.

This solution is similar, but not identical, to the transmitter power loading for maximum
receiver SNR in the DMT system discussed in Sec. 12.8.


Optimum Power Loading in OFDM/DMT 

As the water-filling illustration shows, it is impossible to find a closed-form expression of $W$.
Once $P$ has been specified, an iterative water-filling algorithm can be used to eventually
determine $W$ and hence the optimum power loading PSD $S_x(f)$. Of course, the approach in
practice to determine the water level $W$ is by numerically solving for $W$. The numerical solution
requires dividing the entire channel bandwidth into sufficiently small nonoverlapping bands
of width $\Delta f$.

Indeed, for practical OFDM or DMT communication systems, the iterative water-filling
algorithm is tailor-made to achieve maximum channel capacity. Maximum capacity can be
realized for OFDM channels by allocating different powers $S_i$ to the different orthogonal
subcarriers. In particular, the power allocated to subcarrier $f_i$ should be

$$S_i = \max\left(W\,\Delta f - \frac{N_i}{|H_i|^2},\ 0\right)$$

such that $\sum_i S_i = P$. This optimum power allocation or power loading can be solved by adding
incremental power to the subcarriers one at a time until $\sum_i S_i = P$.
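
The incremental procedure just described can be sketched in a few lines of Python. This is a hedged illustration, not the algorithm of any particular DMT or OFDM standard: the function name, the list `inv_snr` (holding $N_i/|H_i|^2$), and the fixed number of steps are all assumptions made for the example. Each small increment of power goes to the subcarrier whose current "water surface" $N_i/|H_i|^2 + S_i$ is lowest, because the marginal gain of $\log(1 + S_i|H_i|^2/N_i)$ is proportional to $1/(N_i/|H_i|^2 + S_i)$.

```python
def incremental_loading(inv_snr, total_power, steps=3000):
    """Greedy power loading: hand out total_power in `steps` equal increments.

    inv_snr[i] stands for N_i / |H_i|^2. As the increment shrinks, the result
    approaches the water-filling allocation of Eq. (13.88).
    """
    increment = total_power / steps
    powers = [0.0] * len(inv_snr)
    for _ in range(steps):
        # Pick the subcarrier with the lowest current level N_i/|H_i|^2 + S_i.
        k = min(range(len(inv_snr)), key=lambda i: inv_snr[i] + powers[i])
        powers[k] += increment
    return powers


print(incremental_loading([1.0, 2.0, 5.0], total_power=3.0))
```

With the same toy channel used for the bisection idea (levels 1, 2, and 5 and $P = 3$), the greedy loop ends near the allocation 2, 1, 0, with an error on the order of one increment.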
13.8 MULTIPLE-INPUT-MULTIPLE-OUTPUT COMMUNICATION SYSTEMS

In the past decade, one of the important breakthroughs in wireless communications has been the
advent of multiple-input-multiple-output (MIMO) technologies. In fact, both the Wi-Fi (IEEE
802.11n) standard and the WiMAX (IEEE 802.16e) standard have incorporated MIMO
transmitters and receivers (or transceivers). The key advantage of MIMO wireless communication
systems lies in their ability to significantly increase wireless channel capacity without either
requiring additional bandwidth or substantially increasing the signal power at the transmitter.
Interestingly, the MIMO development originates from the fundamentals of information theory.
We shall explain this connection here.


13.8.1 Capacity of MIMO Channels 

Whereas earlier only a single signal variable was considered for transmission, we now deal
with input and output signal vectors. In other words, each signal vector consists of multiple
data symbols to be transmitted or received concurrently in MIMO systems. Consider a random
signal vector $\mathbf{x} = (x_1, x_2, \ldots, x_N)^T$. If the random signal vector is discrete with probabilities

$$p_i = P(\mathbf{x} = \mathbf{x}_i) \qquad i = 1, 2, \ldots$$

then the entropy of $\mathbf{x}$ is determined by

$$H(\mathbf{x}) = -\sum_i p_i \log p_i \tag{13.91}$$

Similarly, when $\mathbf{x}$ is continuously distributed with probability density function
$p(x_1, x_2, \ldots, x_N)$, its differential entropy is defined by

$$H(\mathbf{x}) = -\int \cdots \int p(x_1, x_2, \ldots, x_N) \log p(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N \tag{13.92}$$

Consider a real-valued random vector $\mathbf{x}$ consisting of $N$ jointly Gaussian random variables.
Let $\mathbf{x}$ have (vector) mean $\boldsymbol{\mu}$ and covariance matrix

$$C_x = E\{(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^T\}$$

Its differential entropy can be found⁵ to be

$$H(\mathbf{x}) = \frac{1}{2}\left[N \cdot \log(2\pi e) + \log\det(C_x)\right] \tag{13.93}$$

Clearly, the entropy of a random vector is not affected by the mean $\boldsymbol{\mu}$. It is therefore convenient
to consider only the random vectors with zero mean. From now on, we will assume that

$$\boldsymbol{\mu} = E\{\mathbf{x}\} = \mathbf{0}$$
Among all the real-valued random variable vectors that have zero mean and satisfy the condition

$$C_x = \mathrm{Cov}(\mathbf{x}, \mathbf{x}) = E\{\mathbf{x}\mathbf{x}^T\}$$

we have¹¹

$$\max_{p_x(\mathbf{x}):\ \mathrm{Cov}(\mathbf{x}, \mathbf{x}) = C_x} H(\mathbf{x}) = \frac{1}{2}\left[N \cdot \log(2\pi e) + \log\det(C_x)\right] \tag{13.94}$$

This means that the Gaussian vector distribution has maximum entropy among all real random
vectors of the same covariance matrix.
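
As a quick numerical illustration of Eq. (13.93), the Python sketch below evaluates the differential entropy (in bits) of a zero-mean Gaussian vector from its covariance matrix. The helper names `det` and `gaussian_entropy_bits` are hypothetical, and the determinant is computed with a plain Gaussian-elimination routine only to keep the example free of external libraries.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result            # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def gaussian_entropy_bits(cov):
    """Eq. (13.93) with base-2 logarithms: 0.5*[N*log2(2*pi*e) + log2 det(C_x)]."""
    n = len(cov)
    return 0.5 * (n * math.log2(2.0 * math.pi * math.e) + math.log2(det(cov)))


independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.9], [0.9, 1.0]]
print(gaussian_entropy_bits(independent), gaussian_entropy_bits(correlated))
```

Both vectors have unit-variance components, but the correlated one has $\det(C_x) = 0.19 < 1$ and hence a smaller entropy: correlation between the components reduces the uncertainty of the vector.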

Now consider a flat fading MIMO channel with matrix gain $\mathbf{H}$. The $N \times M$ channel matrix
$\mathbf{H}$ connects the $M \times 1$ input vector $\mathbf{x}$ and $N \times 1$ output vector $\mathbf{y}$ such that

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w} \tag{13.95}$$

where $\mathbf{w}$ is the $N \times 1$ additive white Gaussian noise vector with zero mean and covariance
matrix $C_w$. As shown in Fig. 13.13, a MIMO system consists of $M$ transmit antennas at the
transmitter end and $N$ receive antennas at the receiver end. Each transmit antenna can transmit
to all $N$ receive antennas. Given a fixed channel $\mathbf{H}$ of dimensions $N \times M$ (i.e., $M$ transmit
antennas and $N$ receive antennas), the mutual information between the channel input and output
vectors is

$$I(\mathbf{x}, \mathbf{y}) = H(\mathbf{y}) - H(\mathbf{y}|\mathbf{x}) \tag{13.96a}$$

$$= H(\mathbf{y}) - H(\mathbf{H}\mathbf{x} + \mathbf{w}\,|\,\mathbf{x}) \tag{13.96b}$$

Recall that under the condition that $\mathbf{x}$ is known, $\mathbf{H}\mathbf{x}$ is a constant mean. Hence, the conditional
entropy of $\mathbf{y}$ given $\mathbf{x}$ is

$$H(\mathbf{y}|\mathbf{x}) = H(\mathbf{H}\mathbf{x} + \mathbf{w}\,|\,\mathbf{x}) = H(\mathbf{w}) \tag{13.97}$$

and

$$I(\mathbf{x}, \mathbf{y}) = H(\mathbf{y}) - H(\mathbf{w}) \tag{13.98a}$$

$$= H(\mathbf{y}) - \frac{1}{2}\left[N \cdot \log_2(2\pi e) + \log\det(C_w)\right] \tag{13.98b}$$


Figure 13.13 MIMO system with M transmit antennas and N receive antennas.
As a result, we can use the result of Eq. (13.94) to obtain

$$\max I(\mathbf{x}, \mathbf{y}) = \max H(\mathbf{y}) - \frac{1}{2}\left[N \cdot \log_2(2\pi e) + \log\det(C_w)\right] \tag{13.99a}$$

$$= \frac{1}{2}\left[N \cdot \log_2(2\pi e) + \log\det(C_y)\right] - \frac{1}{2}\left[N \cdot \log_2(2\pi e) + \log\det(C_w)\right] \tag{13.99b}$$

$$= \frac{1}{2}\left[\log\det(C_y) - \log\det(C_w)\right]$$

$$= \frac{1}{2}\log\det\left(C_y C_w^{-1}\right) \tag{13.99c}$$

Since the channel input $\mathbf{x}$ is independent of the noise vector $\mathbf{w}$, we have

$$C_y = \mathrm{Cov}(\mathbf{y}, \mathbf{y}) = \mathbf{H} C_x \mathbf{H}^T + C_w$$

Thus, the capacity of the channel per vector transmission is

$$C_s = \max I(\mathbf{x}, \mathbf{y}) = \frac{1}{2}\log\det\left(I + \mathbf{H} C_x \mathbf{H}^T C_w^{-1}\right) \tag{13.100}$$

Given a symmetric low-pass channel with $B$ Hz bandwidth, $2B$ samples of $\mathbf{x}$ can be
transmitted per second to yield a channel capacity of

$$C(\mathbf{H}) = B\log\det\left(I + \mathbf{H} C_x \mathbf{H}^T C_w^{-1}\right) = B\log\det\left(I + C_x \mathbf{H}^T C_w^{-1} \mathbf{H}\right) \tag{13.101}$$

where we have invoked the equality that for matrices $A$ and $B$ of appropriate dimensions,
$\det(I + A \cdot B) = \det(I + B \cdot A)$. We clearly can see from Eq. (13.101) that the channel
capacity depends on the covariance matrix $C_x$ of the Gaussian input signal vector. This result
shows that, given the knowledge of the MIMO channel ($\mathbf{H}^T C_w^{-1} \mathbf{H}$) at the transmitter, an
optimum input signal can be determined by designing $C_x$ to maximize the overall channel
capacity $C(\mathbf{H})$.

We now are left with two scenarios to consider: (1) MIMO transmitters without the
MIMO channel knowledge and (2) MIMO transmitters with channel knowledge that allows
$C_x$ to be optimized. We shall discuss the MIMO channel capacity in these two separate
cases.

13.8.2 Transmitter without Channel Knowledge 

For transmitters without channel knowledge, the input covariance matrix $C_x$ should be chosen
without showing any preference. As a result, the default $C_x = \sigma_x^2 I$ should be selected. In this
case, the MIMO system capacity is simply

$$C = B\log\det\left(I + \sigma_x^2 \mathbf{H}^T C_w^{-1} \mathbf{H}\right) \tag{13.102}$$
Consider the eigendecomposition of

$$\mathbf{H}^T C_w^{-1} \mathbf{H} = U D U^H$$

where $U$ is an $M \times M$ square unitary matrix such that $U U^H = I_M$ and $D$ is a diagonal matrix
with nonnegative diagonal elements in descending order:

$$D = \mathrm{Diag}(d_1, d_2, \ldots, d_r, 0, \ldots, 0)$$

Notice that $d_r > 0$ is the smallest nonzero eigenvalue of $\mathbf{H}^T C_w^{-1} \mathbf{H}$, whose rank is bounded by
$r \le \min\{N, M\}$. Because $\det(I + AB) = \det(I + BA)$ and $U^H U = I$, we have

$$C = B\log\det\left(I + \sigma_x^2\, U D U^H\right) \tag{13.103a}$$

$$= B\log\det\left(I + \sigma_x^2\, D\, U^H U\right)$$

$$= B\log\det\left(I + \sigma_x^2 D\right)$$

$$= B\log\prod_{i=1}^{r}\left(1 + \sigma_x^2 d_i\right) \tag{13.103b}$$

$$= B\sum_{i=1}^{r}\log\left(1 + \sigma_x^2 d_i\right) \tag{13.103c}$$

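In the special case of channel noise that is additive, white, and Gaussian, $C_w = \sigma_w^2 I$ and

$$\mathbf{H}^T C_w^{-1} \mathbf{H} = \frac{1}{\sigma_w^2}\mathbf{H}^T\mathbf{H} = \frac{1}{\sigma_w^2}\, U\, \mathrm{Diag}(\gamma_1, \gamma_2, \ldots, \gamma_r, 0, \ldots, 0)\, U^H \tag{13.104}$$

where $\gamma_i$ is the $i$th largest eigenvalue of $\mathbf{H}^T\mathbf{H}$, which is assumed to have rank $r$. Consequently,
the channel capacity for this MIMO system is

$$C = B\sum_{i=1}^{r}\log\left(1 + \frac{\sigma_x^2}{\sigma_w^2}\gamma_i\right) \tag{13.105}$$

In short, this channel capacity is the sum of the capacity of $r$ parallel AWGN channels. Each
subchannel SNR is $\sigma_x^2 \gamma_i/\sigma_w^2$. Figure 13.14 demonstrates the equivalent system that consists
of $r$ parallel AWGN channels with $r$ active input signals $x_1, \ldots, x_r$.

In the special case when the MIMO channel is so well conditioned that all its nonzero
eigenvalues are identical, $\gamma_i = \gamma$, the channel capacity is

$$C_{\mathrm{MIMO}} = r \cdot B\log\left(1 + \frac{\sigma_x^2}{\sigma_w^2}\gamma\right) \tag{13.106}$$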
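
The equivalence between the determinant form of Eq. (13.102) and the parallel-channel sum of Eq. (13.105) can be checked numerically. The Python sketch below is an illustration under simplifying assumptions: a real channel with two transmit antennas (so that the eigenvalues of the $2 \times 2$ matrix $\mathbf{H}^T\mathbf{H}$ have a closed form), white noise, bandwidth normalized to $B = 1$, and `snr` standing for $\sigma_x^2/\sigma_w^2$. All function names are hypothetical.

```python
import math


def gram(H):
    """Return H^T H for a real matrix H given as a list of rows."""
    cols = len(H[0])
    return [[sum(row[i] * row[j] for row in H) for j in range(cols)]
            for i in range(cols)]


def eig_sym_2x2(G):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix."""
    tr = G[0][0] + G[1][1]
    dt = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    root = math.sqrt(max(tr * tr - 4.0 * dt, 0.0))
    return (tr + root) / 2.0, (tr - root) / 2.0


def capacity_det(H, snr):
    """log2 det(I + snr * H^T H), the form of Eq. (13.102) with B = 1."""
    G = gram(H)
    a, b = 1.0 + snr * G[0][0], snr * G[0][1]
    c, d = snr * G[1][0], 1.0 + snr * G[1][1]
    return math.log2(a * d - b * c)


def capacity_modes(H, snr):
    """Sum over eigenmodes of log2(1 + snr * gamma_i), as in Eq. (13.105)."""
    return sum(math.log2(1.0 + snr * g) for g in eig_sym_2x2(gram(H)))


H = [[1.0, 0.5], [0.2, 1.5]]
print(capacity_det(H, 2.0), capacity_modes(H, 2.0))
```

The two printed values agree to within rounding error. Replacing `H` by the identity matrix gives the well-conditioned case of Eq. (13.106) with $r = 2$ and $\gamma = 1$, that is, twice $\log_2(1 + \mathrm{snr})$.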
Figure 13.14 r-Channel communication system equivalent to a MIMO system without channel knowledge at the transmitter.



By comparison, for the single-input-single-output (SISO) channel, $\mathbf{H}$ is a scalar such that $r = 1$,
and the SISO channel capacity is simply

$$C_{\mathrm{SISO}} = B\log\left(1 + \frac{\sigma_x^2}{\sigma_w^2}\gamma\right) \tag{13.107}$$

Therefore, in this well-conditioned case, applying MIMO transceivers increases the channel
capacity to $r$ times the capacity of the original single-input-single-output channel. This result
strongly demonstrates the significant advantages of MIMO technology in providing much-needed
capacity improvement for wireless communications.

13.8.3 Transmitter with Channel Knowledge

In a number of wireless communication systems, the transmitter may acquire the knowledge
of the MIMO channel $\mathbf{H}^T C_w^{-1} \mathbf{H}$ through a feedback mechanism. In this case, the transmitter
can optimize the input signal covariance matrix $C_x$ to maximize the MIMO system capacity.¹²

First, we observe that the channel capacity of Eq. (13.101) can be increased simply by
scaling the matrix $C_x$ with a large constant $k$. Of course, doing so would be effectively
increasing the transmission power $k$ times and would be unfair. This means that to be fair, the design
of optimum covariance matrix $C_x$ must be based on some practical constraint. In a typical
communication system, we know that a transmitter with higher signal power will lead to
higher SNR and, hence, larger capacity. Therefore, similar to the water-pouring PSD design for
frequency-selective channels, we should constrain the total transmission power of the MIMO
transmitter by the transmitter power threshold $P$.

To show how this power constraint would affect the input covariance matrix, we first
need to introduce the "trace" (Tr) operator of square matrices. Consider an $M \times M$ square
matrix $F$ whose element on the $i$th row and the $j$th column is denoted by $F_{i,j}$. Then the trace
of the matrix $F$ is the sum of its diagonal elements

$$\mathrm{Tr}(F) = \sum_{i=1}^{M} F_{i,i} \tag{13.108}$$


Since the trace operator is linear, it follows from the property of the expectation operator $E\{\cdot\}$
[Eq. (8.59)] that

$$E\{\mathrm{Tr}(F)\} = \mathrm{Tr}(E\{F\}) \tag{13.109}$$

We now introduce a very useful property of the trace operator. If matrix products $AB$ and
$BA$ are both square matrices of appropriate sizes, then they both have the same trace, that is,

$$\mathrm{Tr}(AB) = \mathrm{Tr}(BA) \tag{13.110}$$
This equality turns out to be very important. By applying Eq. (13.110), we know that for
vector $\mathbf{x}$

$$\mathbf{x}^T\mathbf{x} = \mathrm{Tr}[\mathbf{x}^T\mathbf{x}] \tag{13.111a}$$

$$= \mathrm{Tr}[\mathbf{x}\mathbf{x}^T] \tag{13.111b}$$

For the signal vector $\mathbf{x} = (x_1, x_2, \ldots, x_M)$, we can apply Eqs. (13.109) and (13.111) to
show that the average sum power of the signal vector $\mathbf{x}$ is

$$\sum_{i=1}^{M} E\{x_i^2\} = E\left\{\sum_{i=1}^{M} x_i^2\right\} \tag{13.112a}$$

$$= E\{\mathbf{x}^T\mathbf{x}\}$$

$$= E\{\mathrm{Tr}[\mathbf{x}\mathbf{x}^T]\}$$

$$= \mathrm{Tr}[E\{\mathbf{x}\mathbf{x}^T\}]$$

$$= \mathrm{Tr}[C_x] \tag{13.112b}$$

As a result, we have established that the power constraint translates into the trace constraint

$$\mathrm{Tr}(C_x) \le P$$
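
The two trace identities used above are easy to confirm on small numbers. The Python sketch below is purely illustrative: the matrices, the vector, and the helper names `matmul` and `trace` are arbitrary choices made for the example.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(F):
    """Sum of the diagonal elements, Eq. (13.108)."""
    return sum(F[i][i] for i in range(len(F)))


A = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0]]        # 2 x 3
B = [[2.0, 1.0], [0.5, -2.0], [1.0, 4.0]]      # 3 x 2
print(trace(matmul(A, B)), trace(matmul(B, A)))   # Eq. (13.110): equal traces

x = [1.0, -2.0, 3.0]
outer = [[xi * xj for xj in x] for xi in x]     # x x^T
print(sum(xi * xi for xi in x), trace(outer))     # Eq. (13.111): x^T x = Tr[x x^T]
```

Here $AB$ is $2 \times 2$ and $BA$ is $3 \times 3$, yet both traces equal 17.5; likewise $\mathbf{x}^T\mathbf{x}$ and $\mathrm{Tr}[\mathbf{x}\mathbf{x}^T]$ both equal 14.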

Therefore, given the knowledge of $\mathbf{H}^T C_w^{-1} \mathbf{H}$ at the transmitter, the optimum input signal
covariance matrix to maximize the channel capacity is defined by

$$\max_{C_x:\ \mathrm{Tr}(C_x) \le P} B\log\det\left(I + C_x \mathbf{H}^T C_w^{-1} \mathbf{H}\right) \tag{13.113}$$

This optimization problem is therefore well defined.

To find the optimum $C_x$, recall the eigendecomposition

$$\mathbf{H}^T C_w^{-1} \mathbf{H} = U D U^H$$

By applying the determinant identity $\det(I + AB) = \det(I + BA)$, we can rewrite the optimum
covariance design problem into

$$\max_{\mathrm{Tr}(C_x) \le P} B\log\det\left(I + C_x U D U^H\right) = \max_{\mathrm{Tr}(C_x) \le P} B\log\det\left(I + U^H C_x U D\right) \tag{13.114}$$

Because covariance matrices are positive semidefinite (Appendix D.7), we can define a new
positive semidefinite matrix

$$\bar{C}_x = U^H C_x U \tag{13.115}$$

According to Eq. (13.110), we know that

$$\mathrm{Tr}[\bar{C}_x] = \mathrm{Tr}[U^H C_x U] = \mathrm{Tr}[C_x U U^H] = \mathrm{Tr}[C_x I] = \mathrm{Tr}[C_x] \tag{13.116}$$

In fact, Eq. (13.116) states that the traces of $\bar{C}_x$ and $C_x$ are identical. This equality allows
us to simplify the capacity maximization problem into

$$C = \max_{C_x:\ \mathrm{Tr}(C_x) \le P} B\log\det\left(I + U^H C_x U D\right)$$

$$= \max_{\bar{C}_x:\ \mathrm{Tr}(\bar{C}_x) \le P} B\log\det\left(I + \bar{C}_x D\right) \tag{13.117a}$$

$$= \max_{\bar{C}_x:\ \mathrm{Tr}(\bar{C}_x) \le P} B\log\det\left(I + D^{1/2}\bar{C}_x D^{1/2}\right) \tag{13.117b}$$

The problem of Eq. (13.117b) is simpler because $D$ is a diagonal matrix. Furthermore,
we can invoke the help of a very useful tool often used in matrix optimization known as the
Hadamard inequality.

Hadamard Inequality: Let $a_{ij}$ be the element of complex $n \times n$ matrix $A$ on the $i$th row
and the $j$th column. $A$ is positive semidefinite and Hermitian, that is, $(\mathrm{conj}(A))^T = A$. Then
the following inequality holds:

$$\det(A) \le \prod_{i=1}^{n} a_{ii}$$

with equality if and only if $A$ is diagonal.

We can easily verify that $I + D^{1/2}\bar{C}_x D^{1/2}$ is positive semidefinite because $\bar{C}_x$ is positive
semidefinite (Prob. 13.8-3). By invoking the Hadamard inequality in Eq. (13.117b), it is clear that,
for maximum channel capacity, we need

$$D^{1/2}\bar{C}_x D^{1/2} = \text{diagonal}$$

In other words, the optimum channel input requires that

$$\bar{C}_x = D^{-1/2} \cdot \text{diagonal} \cdot D^{-1/2} = \text{diagonal} \tag{13.118}$$

Equation (13.118) establishes that the optimum structure of $\bar{C}_x$ is diagonal. This result greatly
simplifies the capacity maximization problem. Denote the optimum structure covariance
matrix as

$$\bar{C}_x = \text{diagonal}(c_1, c_2, \ldots, c_M)$$
Then the capacity is maximized by a positive semidefinite matrix $\bar{C}_x$ according to

$$C = \max_{\bar{C}_x:\ \mathrm{Tr}(\bar{C}_x) \le P} B\log\det\left(I + D^{1/2}\bar{C}_x D^{1/2}\right) \tag{13.119a}$$

$$= \max_{\sum_{i=1}^{M} c_i \le P,\ c_i \ge 0} B\sum_{i=1}^{M}\log\left(1 + c_i d_i\right) \tag{13.119b}$$

In other words, our job is to find the optimum nonnegative elements $\{c_i\}$ to maximize Eq. (13.119b)
subject to the constraint $\sum_i c_i \le P$.

Taking the Lagrangian approach, we define a modified objective function

$$g(c_1, c_2, \ldots, c_M) = B\sum_{i=1}^{M}\log\left(1 + c_i d_i\right) + \lambda\left(P - \sum_{i=1}^{M} c_i\right) \tag{13.120}$$

Taking the derivative of the modified objective function with respect to $c_j$ ($j = 1, 2, \ldots, M$) and
setting it to zero, we have

$$B\,\frac{\log_2 e \cdot d_j}{1 + c_j d_j} - \lambda = 0 \qquad j = 1, 2, \ldots, M$$

or

$$c_j = \left[\frac{B}{\lambda \ln 2} - \frac{1}{d_j}\right] \qquad j = 1, 2, \ldots, M$$

The optimum diagonal elements $\{c_i\}$ are subject to the constraints

$$\sum_{i=1}^{M} c_i = P$$

$$c_j \ge 0 \qquad j = 1, \ldots, M$$

Similar to the problem of colored Gaussian noise channel power loading, we can define a
water level $W = B/(\lambda \ln 2)$. By applying the same iterative water-pouring procedure, we can
find the optimum power loading (on each eigenvector) to be

$$c_i = \max\left(W - \frac{1}{d_i},\ 0\right) \qquad i = 1, 2, \ldots, M \tag{13.121a}$$

with the total power constraint that

$$\sum_{i=1}^{M} c_i = P \tag{13.121b}$$

The water-filling interpretation of the optimum power loading at a MIMO transmitter given
channel knowledge can be illustrated (Fig. 13.15).

The optimum input signal covariance matrix is therefore determined by

$$C_x = U \cdot \mathrm{Diag}(c_1, c_2, \ldots, c_m, 0, \ldots, 0) \cdot U^H$$
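
The eigenmode power loading of Eq. (13.121) can be sketched as follows. This Python example is an illustration under stated assumptions rather than a definitive implementation: it takes the strictly positive eigenvalues $d_i$ as given, normalizes the bandwidth to $B = 1$, and finds the water level by testing how many of the strongest eigenmodes can be active, instead of iterating. The names `mimo_water_fill` and `capacity` are hypothetical.

```python
import math


def mimo_water_fill(d, total_power):
    """Return (W, c) with c_i = max(W - 1/d_i, 0) and sum(c_i) = total_power.

    d holds the nonzero eigenvalues d_i > 0 of H^T C_w^{-1} H.
    """
    floors = sorted(1.0 / di for di in d)       # container bottom, best mode first
    # Assume the m best modes are active; shrink m until the level covers them all.
    for m in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:m])) / m
        if level > floors[m - 1]:
            break
    return level, [max(level - 1.0 / di, 0.0) for di in d]


def capacity(d, c):
    """Sum of log2(1 + c_i d_i), i.e., Eq. (13.119b) with B = 1."""
    return sum(math.log2(1.0 + ci * di) for ci, di in zip(c, d))


d = [4.0, 1.0, 0.1]
W, c = mimo_water_fill(d, total_power=1.0)
equal = [1.0 / len(d)] * len(d)
print(W, c, capacity(d, c), capacity(d, equal))
```

For these eigenvalues the water level is $W = 1.125$, the weakest eigenmode ($1/d_3 = 10$) receives no power, and the resulting capacity exceeds that of the equal-power allocation $c_i = P/3$ used when the transmitter has no channel knowledge.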





Figure 13.15 Water-filling interpretation of MIMO transmission power loading based on channel knowledge. (Horizontal axis: subchannels (eigenvectors).)

Figure 13.16 Water-pouring interpretation of the optimum MIMO transmission power loading based on channel knowledge.


In other words, the input signal vector can be formed by a unitary transformation $U$ after we
have found $c_i$ based on water pouring. In effect, $c_i$ is the amount of power loaded on the $i$th
column of $U$, that is, the $i$th eigenvector of $\mathbf{H}^T C_w^{-1} \mathbf{H}$.

Suppose we would like to transmit $m$ independent signal streams $\{s_1, s_2, \ldots, s_m\}$ of zero
mean and unit variance. Then the optimum MIMO channel input can be formed via

$$\mathbf{x} = U_1\, \mathrm{diag}\left(\sqrt{c_1}, \sqrt{c_2}, \ldots, \sqrt{c_m}\right) \begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_m \end{bmatrix} \tag{13.122}$$

where $U_1$ are the first $m$ columns of $U$. Figure 13.16 is the block diagram of this optimum
MIMO transmitter, which will maximize channel capacity based on knowledge of the MIMO
channel. The matrix multiplier $U_1\, \mathrm{diag}(\sqrt{c_1}, \sqrt{c_2}, \ldots, \sqrt{c_m})$ at the transmitter is known as
the optimum linear precoder.


13.9 MATLAB EXERCISES 

In this section, we provide MATLAB exercises to reinforce the concepts of source coding and 
channel capacity in this chapter. 


COMPUTER EXERCISE 13.1: HUFFMAN CODE 

The first program, huffmancode.m, is a Huffman encoder function. The user need only supply a
probability vector that consists of all the source symbol probabilities. The probability entries do not need
to be ordered.
function [huffcode,n]=huffmancode(p);
% input p is a probability vector consisting of
% probabilities of source symbols x_i
if min(p)<0,
    error('Negative element cannot be in a probability vector')
elseif abs(sum(p)-1)>1.e-12,
    error('Sum of input probability is not 1')
end
[psort,pord]=sort(p);
n=length(p);
q=p;
% Record the merge order of the n-1 Huffman reduction steps
for i=1:n-1
    [q,l]=sort(q);
    m(i,:)=[l(1:n-i+1),zeros(1,i-1)];
    q=[q(1)+q(2),q(3:end),1];
end
% Build the codewords backward from the last reduction step
Cword=blanks(n^2);
Cword(n)='0';
Cword(2*n)='1';
for i1=1:n-2
    Ctemp=Cword;
    idx0=find(m(n-i1,:)==1)*n;
    Cword(1:n)=[Ctemp(idx0-n+2:idx0) '0'];
    Cword(n+1:2*n)=[Cword(1:n-1) '1'];
    for i2=2:i1+1
        idx2=find(m(n-i1,:)==i2);
        Cword(i2*n+1:(i2+1)*n)=Ctemp(n*(idx2-1)+1:n*idx2);
    end
end
% Reorder the codewords to match the input symbol order
for i=1:n
    idx1=find(m(1,:)==i);
    huffcode(i,1:n)=Cword(n*(idx1-1)+1:idx1*n);
end
end


The second program, huffmanEx.m, generates a very simple example of Huffman encoding. In
this exercise, we provide an input probability vector of length 8. The MATLAB program huffmanEx.m
will generate the list of codewords for all the input symbols. The entropy of this source H(x) is computed
and compared against the average Huffman codeword length. Their ratio shows the efficiency of the
code.

% Matlab Program <huffmanEx.m>
% This exercise requires the input of a
% probability vector p that list all the
% probabilities of each source input symbol
clear;
p=[0.2 0.05 0.03 0.1 0.3 0.02 0.22 0.08];  %Symbol probability vector
[huffcode,n]=huffmancode(p);               %Encode Huffman code
entropy=sum(-log(p).*p)/log(2);            %Find entropy of the source
% Display the results of Huffman encoder
display(['symbol',' --> ',' codeword',' Probability'])
for i=1:n
    % Codeword length = n minus the number of padding blanks (ASCII 32)
    codeword_Length(i)=n-length(find(abs(huffcode(i,:))==32));
    display(['x',num2str(i),' --> ',huffcode(i,:),' ',num2str(p(i))])
end
codeword_Length
avg_length=codeword_Length*p';
display(['Entropy = ', num2str(entropy)])
display(['Average codeword length = ', num2str(avg_length)])

By executing the program huffmanEx.m, we can obtain the following results.

huffmanEx
symbol --> codeword   Probability
x1     --> 00         0.2
x2     --> 10111      0.05
x3     --> 101101     0.03
x4     --> 100        0.1
x5     --> 11         0.3
x6     --> 101100     0.02
x7     --> 01         0.22
x8     --> 1010       0.08

codeword_Length =
     2     5     6     3     2     6     2     4

Entropy = 2.5705
Average codeword length = 2.61
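
The two summary figures above can be cross-checked independently of the MATLAB listing. The Python sketch below is not a translation of huffmancode.m; it is a separate, assumed implementation that uses a priority queue and returns only the codeword lengths. Because ties between equal probabilities may be broken differently, the individual lengths need not match the table above, but every Huffman code for a given source has the same average length.

```python
import heapq
import math


def huffman_lengths(p):
    """Return the Huffman codeword length of each symbol with probability p[i]."""
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(pi, i, [i]) for i, pi in enumerate(p)]
    heapq.heapify(heap)
    lengths = [0] * len(p)
    counter = len(p)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)     # two least probable subtrees
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:                   # every merged symbol gains one bit
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


p = [0.2, 0.05, 0.03, 0.1, 0.3, 0.02, 0.22, 0.08]
lengths = huffman_lengths(p)
avg_length = sum(li * pi for li, pi in zip(lengths, p))
entropy = -sum(pi * math.log2(pi) for pi in p)
print(avg_length, entropy, entropy / avg_length)
```

The average length evaluates to 2.61 bits and the entropy to about 2.5705 bits, in agreement with the MATLAB output; their ratio, roughly 0.985, is the efficiency of the code.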


COMPUTER EXERCISE 13.2: CHANNEL CAPACITY AND MUTUAL INFORMATION 

This exercise provides an opportunity to compute the single-input-single-output channel capacity under
additive white Gaussian noise.

MATLAB program mutualinfo.m contains a function that can compute the average mutual
information between two data sequences x and y of equal length. We use a histogram to estimate the joint
probability density function p(x, y) before calculating the mutual information according to the definition
of Eq. (13.45a).

function muinfo_bit=mutualinfo(x,y)
%mutualinfo Computes the mutual information of two
%           vectors x and y in bits
%   muinfo_bit = mutualinfo(X,Y)
%
%   output : mutual information
%   X, Y   : The 1-D vectors to be analyzed

minx=min(x);
maxx=max(x);
deltax=(maxx-minx)/(length(x)-1);
lowerx=minx-deltax/2;
upperx=maxx+deltax/2;
ncellx=ceil(length(x)^(1/3));

miny=min(y);
maxy=max(y);
deltay=(maxy-miny)/(length(y)-1);
lowery=miny-deltay/2;
uppery=maxy+deltay/2;
ncelly=ncellx;

rout(1:ncellx,1:ncelly)=0;

xx=round( (x-lowerx)/(upperx-lowerx)*ncellx + 1/2 );
yy=round( (y-lowery)/(uppery-lowery)*ncelly + 1/2 );

% build the two-dimensional histogram
for n=1:length(x)
    indexx=xx(n);
    indexy=yy(n);
    if indexx >= 1 & indexx <= ncellx & indexy >= 1 & indexy <= ncelly
        rout(indexx,indexy)=rout(indexx,indexy)+1;
    end;
end;

h=rout;

estimate=0;
sigma=0;
count=0;

% determine row and column sums
hy=sum(h);
hx=sum(h');

for nx=1:ncellx
    for ny=1:ncelly
        if h(nx,ny)~=0
            logf=log(h(nx,ny)/hx(nx)/hy(ny));
        else
            logf=0;
        end;
        count=count+h(nx,ny);
        estimate=estimate+h(nx,ny)*logf;
        sigma=sigma+h(nx,ny)*logf^2;
    end;
end;

% biased estimate
estimate=estimate/count;
sigma   =sqrt( (sigma/count-estimate^2)/(count-1) );
estimate=estimate+log(count);
nbias   =(ncellx-1)*(ncelly-1)/(2*count);

% remove bias
muinfo_bit=(estimate-nbias)/log(2);

In the main MATLAB program, capacity_plot.m, we calculate AWGN channel capacity for
S/N ratio of 0, 5, 10, 15, and 20 dB. The channel capacity under different SNRs is plotted in Fig. 13.17.
In addition, we can test the mutual information I(x, y) between the channel input x and the corresponding
channel output y under the same SNR levels.

In this program, we estimate I(x, y) for five different zero-mean input signals of unit variance:

• Gaussian input

• Binary input of equal probability

• PAM-4 input of equal probability

• PAM-8 input of equal probability

• Uniform input in interval [-√3, √3]

The corresponding mutual information I(x, y) is estimated by averaging over 1,000,000 data samples.

% Matlab program <capacity_plot.m>
clear;clf;
Channel_gain=1;
H=Channel_gain;          % AWGN Channel gain
SNRdb=0:5:20;            % SNR in dB
L=1000000;
SNR=10.^(SNRdb/10);
% Compute the analytical channel capacity
Capacity=1/2*log(1+H*SNR)/log(2);
% Now to estimate the mutual information between the input
% and the output signals of AWGN channels
for kk=1:length(SNRdb),
    noise=randn(L,1)/sqrt(SNR(kk));
    x=randn(L,1);
    x1=sign(x);
    x2=(floor(rand(L,1)*4-4.e-10)*2-3)/sqrt(5);
    x3=(floor(rand(L,1)*8-4.e-10)*2-7)/sqrt(21);
    x4=(rand(L,1)-0.5)*sqrt(12);
    muinfovec(kk,1)=mutualinfo(x,x+noise);    %Gaussian input
    muinfovec(kk,2)=mutualinfo(x1,x1+noise);  % Binary input (-1,+1)
    muinfovec(kk,3)=mutualinfo(x2,x2+noise);  % 4-PAM input (-3,-1,1,3)
    muinfovec(kk,4)=mutualinfo(x3,x3+noise);  % 8-PAM input (-7,-5,-3,-1,1,3,5,7)
    muinfovec(kk,5)=mutualinfo(x4,x4+noise);  % Uniform input (-0.5,0.5)
end
plot(SNRdb,Capacity,'k-d');hold on
plot(SNRdb,muinfovec(:,1),'k-o')
plot(SNRdb,muinfovec(:,2),'k-s')
plot(SNRdb,muinfovec(:,3),'k-v')
plot(SNRdb,muinfovec(:,4),'k-x')
plot(SNRdb,muinfovec(:,5),'k-*')
xlabel('SNR (dB)');ylabel('mutual information (bits/sample)')
legend('Capacity','Gaussian','binary','PAM-4','PAM-8',...
    'uniform','Location','Northwest')
hold off

Figure 13.17 Channel capacity compared with mutual information between channel output and different input signals. (Horizontal axis: SNR, dB.)

The estimated mutual information is plotted against the channel capacity under different SNR
for the five different input distributions: (1) Gaussian; (2) binary (±1); (3) 4-level PAM (or PAM-4);
(4) 8-level PAM (or PAM-8); (5) uniform. All five symmetric distributions are scaled to have the same
zero mean and unit power (variance). As shown in Figure 13.17, the mutual information achieved by
Gaussian input closely matches the theoretical channel capacity. This result confirms the conclusion
of Sec. 13.5 that Gaussian channel input achieves channel capacity. Fig. 13.17 shows that the mutual
information for all other channel inputs falls below the mutual information achieved by the Gaussian
input. Among the five different distributions, binary input achieves the least mutual information, whereas
the mutual information of PAM-8 input is very close to the channel capacity for the SNR below 20 dB.
This observation indicates that higher mutual information can be achieved when the distribution of the
channel input is closer to Gaussian.
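
The analytical curve in this exercise is simply $\frac{1}{2}\log_2(1 + \mathrm{SNR})$ bits per sample. The short Python sketch below, offered only as an illustration, tabulates it at the five SNR values used above and compares it with the entropy of each equiprobable discrete input ($\log_2$ of the number of levels), which is an upper bound on the mutual information that input can deliver. The dictionary `ceilings` and the function name are hypothetical.

```python
import math


def awgn_capacity_bits(snr_db):
    """AWGN capacity per sample, 0.5*log2(1 + SNR), for an SNR given in dB."""
    snr = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.log2(1.0 + snr)


# Entropy of each equiprobable discrete input: I(x, y) can never exceed it.
ceilings = {"binary": 1.0, "PAM-4": 2.0, "PAM-8": 3.0}

for snr_db in (0, 5, 10, 15, 20):
    c = awgn_capacity_bits(snr_db)
    limited = [name for name, bits in ceilings.items() if c > bits]
    print(snr_db, round(c, 4), "capacity exceeds entropy of:", limited)
```

At 0 dB the capacity is exactly 0.5 bit per sample; it passes 1 bit just above 5 dB, which is where the binary input must begin to fall behind, and it exceeds 3 bits at 20 dB, so even PAM-8 can no longer track the capacity there.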


COMPUTER EXERCISE 13.3: MIMO CHANNEL CAPACITY 

We show in this exercise how MIMO channel capacity varies for different numbers of transmit antennas
and receive antennas. The MATLAB program mimocap.m will calculate the theoretical MIMO capacity
of 200 random MIMO channels of different sizes at an SNR of 3 dB. We consider the case of a transmitter
that does not have the MIMO channel knowledge. Hence, each transmit antenna is allocated the same
signal power $\sigma_x^2$. Additionally, the channel noises are assumed to be independent additive white Gaussian
with variance $\sigma_w^2$.

The entries in the MIMO channel matrix H are randomly generated from Gaussian distribution of
zero mean and unit variance. Because the channels are random, for M transmit antennas and N receive
antennas, the MIMO capacity per transmission is

$$C = \frac{1}{2}\log\det\left[I_N + \frac{\sigma_x^2}{\sigma_w^2}\mathbf{H}\mathbf{H}^T\right]$$

Because the entries in the MIMO channel matrix H are randomly generated, its corresponding capacity
is also random. From the 200 channels, each N × M MIMO configuration should generate 200 different
capacity values.
% Matlab Program <mimocap.m>
% This program calculates the capacity of random MIMO (mxn) channels
% and plots the cumulative distribution (CDF) of the resulting
% capacity;
% Number of random channels: K=200
% Signal to noise ratio: SNRdb=3dB
clear
hold off
clf
K=200;
SNRdb=3;
SNR=10^(SNRdb/10);
m=1;n=1;                          % 1x1 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap11(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N11,C11]=hist(cap11,K/10);       %CDF of MIMO capacity
m=2;n=2;                          % 2x2 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap22(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N22,C22]=hist(cap22,K/10);       %CDF of MIMO capacity
m=4;n=2;                          % 4x2 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap42(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N42,C42]=hist(cap42,K/10);       %CDF of MIMO capacity
m=2;n=4;                          % 2x4 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap24(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N24,C24]=hist(cap24,K/10);       %CDF of MIMO capacity
m=4;n=4;                          % 4x4 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap44(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N44,C44]=hist(cap44,K/10);       %CDF of MIMO capacity
m=8;n=4;                          % 8x4 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap84(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N84,C84]=hist(cap84,K/10);       %CDF of MIMO capacity
m=8;n=8;                          % 8x8 channels
for kk=1:K
    H=randn([m n]);               %Random MIMO Channel
    cap88(kk)=log(det(eye(n,n)+SNR*H'*H))/(2*log(2));
end
[N88,C88]=hist(cap88,K/10);       %CDF of MIMO capacity
% Now ready to plot the CDF of the capacity distribution
plot(C11,cumsum(N11)/K,'k-x',C22,cumsum(N22)/K,'k-o',...
     C24,cumsum(N24)/K,'k-d',C42,cumsum(N42)/K,'k-v',...
     C44,cumsum(N44)/K,'k-s',C84,cumsum(N84)/K,'k-*');
legend('1x1','2x2','2x4','4x2','4x4','8x4','Location','SouthEast');
grid
xlabel('Rate or Capacity (bits/sample) for SNR=3dB');ylabel('CDF');
% End of the plot

In Fig. 13.18, we illustrate the cumulative distribution function (CDF) of the channel capacity
$C_{\mathrm{MIMO}}$

$$\mathrm{Prob}(C_{\mathrm{MIMO}} \le r)$$

of each MIMO configuration estimated from the 200 random channels. We computed the CDF of channel
capacity for six different MIMO configurations: 1 × 1, 2 × 2, 2 × 4, 4 × 2, 4 × 4, and 8 × 4. The results
clearly show that MIMO systems with more transmit and receive antennas will have CDF distributions
concentrated at higher capacity or rate. For example, 2 × 2 MIMO systems will have capacity below
4 bits/sample with a probability of 1. However, for 4 × 4 MIMO systems, the probability drops to only
0.2. When considering 8 × 4 MIMO systems, the probability falls below 0.05. These numerical examples
clearly demonstrate the higher capacity achieved by MIMO technologies.
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PROBLEMS 


13.1-1 A message source generates one of four messages randomly every microsecond. The probabilities
of these messages are 0.4, 0.3, 0.2, and 0.1. Each emitted message is independent of the other
messages in the sequence.

(a) What is the source entropy?

(b) What is the rate of information generated by this source (in bits per second)?

13.1-2 A standard television picture is composed of approximately 300,000 basic picture elements
(about 600 picture elements in a horizontal line and 500 horizontal lines per frame). Each of
these elements can assume 10 distinguishable brightness levels (such as black and shades of
gray) with equal probability. Find the information content of a television picture frame.

13.1-3 A radio announcer describes a television picture orally in 1000 words from his vocabulary of
10,000 words. Assume that each of the 10,000 words in the announcer's vocabulary is equally
likely to occur in the description of this picture (a crude approximation, but good enough to
give an idea). Determine the amount of information broadcast by the announcer in describing
the picture. Would you say the announcer can do justice to the picture in 1000 words? Is the
old adage "A picture is worth a thousand words" an exaggeration or an understatement of the
reality? Use data in Prob. 13.1-2 to estimate the information of a picture.

13.1-4 From the tower of the Old North Church in Boston, Paul Revere's friend was to show one lantern
if the British army began advancing overland and two lanterns if they had chosen to cross the
bay in boats.

(a) Assume that Revere had no way of guessing ahead of time what route the British might
choose. How much information did he receive when he saw two lanterns?

(b) What if Revere were 90% sure the British would march overland? Then, how much
information would the two lanterns have conveyed?

13.1-5 Estimate the information per letter in the English language by various methods, assuming that
each character is independent of the others. (This is not true, but is good enough to get a rough
idea.)

(a) In the first method, assume that all 27 characters (26 letters and a space) are equiprobable.
This is a gross approximation, but good for a quick answer.

(b) In the second method, use the table of probabilities of various characters (Table P13.1-5).

TABLE P13.1-5
Probability of Occurrence of Letters in the English Language

Letter    Probability    $-\log P$
Space     0.187          2.46
E         0.1073         3.22
T         0.0856         3.84
A         0.0668         3.90
O         0.0654         3.94
N         0.0581         4.11
R         0.0559         4.16
I         0.0519         4.27
S         0.0499         4.33
H         0.04305        4.54
D         0.03100        5.02
L         0.02775        5.17
F         0.02395        5.38
C         0.02260        5.45
M         0.02075        5.60
U         0.02010        5.64
G         0.01633        5.94
Y         0.01623        5.95
P         0.01623        5.95
W         0.01620        6.32
B         0.01179        6.42
V         0.00752        7.06
K         0.00344        8.20
X         0.00136        9.54
J         0.00108        9.85
Q         0.00099        9.98
Z         0.00063        10.63

(c) Use Zipf's law relating the word rank to its probability. In English prose, if we order words
according to the frequency of usage so that the most frequently used word (the) is word
number 1 (rank 1), the next most probable word (of) is number 2 (rank 2), and so on, then
empirically it is found that $P(r)$, the probability of the $r$th word (rank $r$), is very nearly

$$ P(r) = \frac{0.1}{r} $$

Now use Zipf's law to compute the entropy per word. Assume that there are 8727 words.
The reason for this number is that the probabilities $P(r)$ sum to 1 for $r$ from 1 to 8727. Zipf's
law, surprisingly, gives reasonably good results. Assuming there are 5.5 letters (including
space) per word on the average, determine the entropy or information per letter.

13.2-1 A source emits seven messages with probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/64,
respectively. Find the entropy of the source. Obtain the compact binary code and find the average
length of the codeword. Determine the efficiency and the redundancy of the code.

13.2-2 A source emits seven messages with probabilities 1/3, 1/3, 1/9, 1/9, 1/27, 1/27, and 1/27,
respectively. Find the entropy of the source. Obtain the compact 3-ary code and find the average length
of the codeword. Determine the efficiency and the redundancy of the code.

13.2-3 A source emits one of four messages randomly every microsecond. The probabilities of these
messages are 0.5, 0.3, 0.1, and 0.1. Messages are generated independently.

(a) What is the source entropy?

(b) Obtain a compact binary code and determine the average length of the codeword, the
efficiency, and the redundancy of the code.

(c) Repeat part (b) for a compact ternary code.

13.2-4 For the messages in Prob. 13.2-1, obtain the compact 3-ary code and find the average length of
the codeword. Determine the efficiency and the redundancy of this code.

13.2-5 For the messages in Prob. 13.2-2, obtain the compact binary code and find the average length
of the codeword. Determine the efficiency and the redundancy of this code.

13.2-6 A source emits three equiprobable messages randomly and independently.

(a) Find the source entropy.

(b) Find a compact ternary code, the average length of the codeword, the code efficiency, and
the redundancy.

(c) Repeat part (b) for a binary code.

(d) To improve the efficiency of a binary code, we now code the second extension of the source.
Find a compact binary code, the average length of the codeword, the code efficiency, and
the redundancy.

13.4-1 A binary channel matrix is given by

                   Outputs
                   $y_1$     $y_2$
Inputs   $x_1$     2/3       1/3
         $x_2$     1/10      9/10

This means $P_{y|x}(y_1|x_1) = 2/3$, $P_{y|x}(y_2|x_1) = 1/3$, etc. You are also given that $P_x(x_1) = 1/3$
and $P_x(x_2) = 2/3$. Determine $H(x)$, $H(x|y)$, $H(y)$, $H(y|x)$, and $I(x; y)$.

13.4-2 For the ternary channel in Fig. P13.4-2, $P_x(x_1) = P$, $P_x(x_2) = P_x(x_3) = Q$. (Note: $P + 2Q = 1$.)

Figure P13.4-2 Ternary channel (input, output).

(a) Determine $H(x)$, $H(x|y)$, $H(y)$, and $I(x; y)$.

(b) Show that the channel capacity $C_s$ is given by

$$ C_s = \log\left(\frac{\beta + 2}{\beta}\right) \tag{13.123} $$

where $\beta = 2^{-[p \log p + (1-p)\log(1-p)]}$.


13.4-3 Consider the binary symmetric channel shown in Fig. P13.4-3a. The channel matrix is given by

$$ \mathbf{M} = \begin{bmatrix} 1 - P_e & P_e \\ P_e & 1 - P_e \end{bmatrix} $$

Figure P13.4-3b shows a cascade of two such BSCs.

(a) Determine the channel matrix for the cascaded channel in Fig. P13.4-3b. Show that this
matrix is $\mathbf{M}^2$.

(b) If the two BSC channels in Fig. P13.4-3b have error probabilities $P_{e1}$ and $P_{e2}$, with channel
matrices $\mathbf{M}_1$ and $\mathbf{M}_2$, respectively, show that the channel matrix of the cascade of these two
channels is $\mathbf{M}_1\mathbf{M}_2$.

(c) Use the results in part (b) to show that the channel matrix for the cascade of $k$ identical
BSCs each with channel matrix $\mathbf{M}$ is $\mathbf{M}^k$. Verify your answer for $k = 3$ by confirming the
results in Example 8.7.

(d) Use the result in part (c) to determine the channel capacity for a cascade of $k$ identical BSC
channels each with error probability $P_e$.

13.4-4 In data communication using error detection code, as soon as an error is detected, an automatic
request for retransmission (ARQ) enables retransmission of the data in error. In such a channel,
the data in error is erased. Hence, there is an erase probability $p$, but the probability of error is
zero. Such a channel, known as a binary erasure channel (BEC), can be modeled as shown
in Fig. P13.4-4. Determine $H(x)$, $H(x|y)$, and $I(x; y)$, assuming the two transmitted messages
are equiprobable.

Figure P13.4-4

13.4-5 A cascade of two channels is shown in Fig. P13.4-5. The symbols at the source, at the output of
the first channel, and at the output of the second channel are denoted by x, y, and z. Show that

$$ H(x|z) \ge H(x|y) $$

and

$$ I(x; y) \ge I(x; z) $$

This shows that the information that can be transmitted over a cascaded channel can be no greater
than that transmitted over one link. In effect, information channels tend to leak information.

Hint: For a cascaded channel, observe that

$$ P(z_k | y_j, x_i) = P(z_k | y_j) $$

Hence, by Bayes' rule,

$$ P(x_i | y_j, z_k) = P(x_i | y_j) $$

Figure P13.4-5 Cascade of two channels, with symbols x, y, and z.


13.5-1 For a continuous random variable x constrained to a peak magnitude $M$ ($-M < x < M$),
show that the entropy is maximum when x is uniformly distributed in the range $(-M, M)$
and has zero probability density outside this range. Show that the maximum entropy is given
by $\log 2M$.

13.5-2 For a continuous random variable x constrained to only positive values $0 < x < \infty$ and a mean
value $A$, show that the entropy is maximum when

$$ p_x(x) = \frac{1}{A} e^{-x/A} u(x) $$

Show that the corresponding entropy is

$$ H(x) = \log eA $$

13.5-3 A television transmission requires 30 frames of 300,000 picture elements each to be transmitted
per second. Use the data in Prob. 13.1-2 to estimate the theoretical bandwidth of the AWGN
channel if the SNR at the receiver is required to be at least 50 dB.

13.7-1 In a communication system over a frequency-selective channel with transfer function


1 A-jnif /200) 


the input signal PSD is 


Sxif) - n 



The channel noise is AWGN with spectrum S^(f) = 10 Find the mutual information between 
the channel input and the channel output. 



14 ERROR CORRECTING CODES


As seen from the discussion in Chapter 13, the key to achieving error-free digital communication
in the presence of distortion, noise, and interference is the addition of
appropriate redundancy to the original data bits. The addition of a single parity check
digit to detect an odd number of errors is a good example. Since Shannon's pioneering paper,1
a great deal of work has been carried out in the area of forward error correcting (FEC) codes.
In this chapter, we will provide an introduction; readers can find much more in-depth coverage
of this topic in the classic textbook by Lin and Costello.2


14.1 OVERVIEW 

Generally, there are two important classes of FEC codes: block codes and convolutional codes.
In block codes, every block of $k$ data digits is encoded into a longer codeword of $n$ digits
($n > k$). Every unique sequence of $k$ data digits fully determines a unique codeword of $n$
digits. In convolutional codes, the coded sequence of $n$ digits depends not only on the $k$ data
digits but also on the previous $N - 1$ data digits ($N > 1$). Hence, the coded sequence for a
certain $k$ data digits is not unique but depends also on $N - 1$ earlier data digits. In short, the
encoder has memory. In block codes, $k$ data digits are accumulated and then encoded into an
$n$-digit codeword. In convolutional codes, the encoding is done on a continuous running basis
rather than by blocks of $k$ data digits.

Shannon's pioneering work1 on the capacity of noisy channels has yielded a famous result
known as the noisy channel coding theorem. This result states that for a noisy channel with a
capacity $C$, there exist codes of rate $R < C$ such that maximum likelihood decoding can lead
to error probability

$$ P_e \le 2^{-nE_b(R)} \tag{14.1} $$

where $E_b(R)$ is the energy per information bit defined as a function of code rate $R$. This
remarkable result shows that arbitrarily small error probability can be achieved by increasing
the block code length $n$ while keeping the code rate constant. A similar result for convolutional
codes was also shown in Ref. 1. Note that this result establishes the existence of good codes. It
does not, however, tell us how to find such codes. In fact, it is not simply a question of designing
good codes. Indeed, this result also requires large $n$ to reduce error probability and requires
decoders to use large storage and high complexity for large codewords of size $n$. Thus, the key
problem in code design is the dual task of searching for good error correction codes with large
length $n$ to reduce error probability, as well as decoders that are simple to implement. The
best results thus far are the recent discovery of turbo codes and the rediscovery of low-density
parity check (LDPC) codes, to be discussed later. The former are derived from convolutional
codes, whereas the latter are a form of block code.

Error correction coding requires a strong mathematical background. To provide a sufficiently
detailed introduction to the important topics on this subject, we organize this
chapter according to the level of mathematical background necessary for understanding. We
begin by covering the simpler and more intuitive block codes that require the least amount of
probability analysis. We then introduce the concepts and principles of convolutional codes and
their decoding. Finally, we focus on the more sophisticated soft-decoding concept, which lays
the foundation for the subsequent coverage of recent progress on high-performance turbo
codes and low-density parity check codes.

14.2 REDUNDANCY FOR ERROR CORRECTION 

In FEC codes, a codeword is a unit of bits that can be decoded independently. The number of
bits in a codeword is known as the code length. If $k$ data digits are transmitted by a codeword
of $n$ digits, the number of check digits is $m = n - k$. The code rate is $R = k/n$. Such a code
is known as an $(n, k)$ code. Data digits $(d_1, d_2, \ldots, d_k)$ are a $k$-tuple, and, hence, this is a
$k$-dimensional vector $\mathbf{d}$. Similarly, a codeword $(c_1, c_2, \ldots, c_n)$ is an $n$-dimensional vector $\mathbf{c}$.
As a preliminary, we shall determine the minimum number of check digits required to detect
or correct $t$ errors in an $(n, k)$ code.

If the binary code length is $n$, then a total of $2^n$ codewords (or vertices of an $n$-dimensional
hypercube) is available to assign to $2^k$ data words. Suppose we wish to find a code that will
correct up to $t$ wrong digits. In this case, if we transmit a data word $\mathbf{d}_j$ by using one of the
codewords (or vertices) $\mathbf{c}_j$, then because of channel errors the received word will not be $\mathbf{c}_j$ but
will be $\mathbf{c}_j'$. If the channel noise causes errors in $t$ or fewer digits, then $\mathbf{c}_j'$ will lie somewhere
inside the Hamming sphere* of radius $t$ centered at $\mathbf{c}_j$. If the code is to correct up to $t$ errors,
then the code must have the property that all the Hamming spheres of radius $t$ centered at the
codewords are nonoverlapping. This means that we must not use vertices as codewords that are
within a Hamming distance $2t$ from any codeword. If a received word lies within a Hamming
sphere of radius $t$ centered at $\mathbf{c}_j$, then we decide that the true transmitted codeword was $\mathbf{c}_j$.
This scheme is capable of correcting up to $t$ errors, and $d_{\min}$, the minimum distance between
$t$ error correcting codewords without overlapping, is

$$ d_{\min} = 2t + 1 \tag{14.2} $$

Next, to find a relationship between $n$ and $k$, we observe that $2^n$ vertices, or words, are
available for $2^k$ data words. Thus, there are $2^n - 2^k$ redundant vertices. How many vertices,
or words, can lie within a Hamming sphere of radius $t$? The number of sequences (of $n$ digits)
that differ from a given sequence by $j$ digits is the number of possible combinations of $n$ things
taken $j$ at a time and is given by $\binom{n}{j}$ [Eq. (8.16)]. Hence, the number of ways in which up to
$t$ errors can occur is given by

$$ \sum_{j=1}^{t} \binom{n}{j} $$

* See Chapter 13 for definitions of Hamming distance and Hamming sphere.

Thus for each codeword, we must leave

$$ \sum_{j=1}^{t} \binom{n}{j} $$

vertices (words) unused. Because we have $2^k$ codewords, we must leave a total of

$$ 2^k \sum_{j=1}^{t} \binom{n}{j} $$

words unused. Therefore, the total number of words must at least be

$$ 2^k + 2^k \sum_{j=1}^{t} \binom{n}{j} = 2^k \sum_{j=0}^{t} \binom{n}{j} $$

But the total number of words, or vertices, available is $2^n$. Thus, we require

$$ 2^n \ge 2^k \sum_{j=0}^{t} \binom{n}{j} $$

or

$$ 2^{n-k} \ge \sum_{j=0}^{t} \binom{n}{j} \tag{14.3a} $$

Observe that $n - k = m$ is the number of check digits. Hence, Eq. (14.3a) can be expressed as

$$ 2^m \ge \sum_{j=0}^{t} \binom{n}{j} \tag{14.3b} $$

This is known as the Hamming bound. It should also be remembered that the Hamming bound
is a necessary but not a sufficient condition in general. However, for single-error correcting
codes, it is a necessary and sufficient condition. If some $m$ satisfies the Hamming bound, it does
not necessarily mean that a $t$-error correcting code of $n$ digits can be constructed. Table 14.1
shows some examples of error correction codes and their rates.
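
As a quick numerical check of the Hamming bound, the following Python sketch evaluates Eq. (14.3a) for a given $(n, k)$ and $t$. The function name `hamming_bound_ok` and the list `table_14_1` are illustrative choices made here, not part of the text; the list simply restates the codes of Table 14.1. Passing the test is only a necessary condition for a code to exist.

```python
from math import comb

def hamming_bound_ok(n, k, t):
    """Return True if 2^(n-k) >= sum_{j=0}^{t} C(n, j), i.e., Eq. (14.3a) holds."""
    return 2 ** (n - k) >= sum(comb(n, j) for j in range(t + 1))

# (n, k, t) triples restating Table 14.1 (illustrative variable name).
table_14_1 = [
    (3, 1, 1), (4, 1, 1), (5, 2, 1), (6, 3, 1), (7, 4, 1), (15, 11, 1), (31, 26, 1),
    (10, 4, 2), (15, 8, 2),
    (10, 2, 3), (15, 5, 3), (23, 12, 3),
]

for n, k, t in table_14_1:
    print((n, k), "t =", t, "bound satisfied:", hamming_bound_ok(n, k, t))
```

For instance, a (7, 5) code cannot correct single errors, because $2^{2} = 4$ is smaller than $1 + 7 = 8$.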

TABLE 14.1
Some Examples of Error Correcting Codes

                                  n     k     Code        Code Efficiency
                                                          (or Code Rate)
Single-error correcting, t = 1    3     1     (3, 1)      0.33
Minimum code separation 3         4     1     (4, 1)      0.25
                                  5     2     (5, 2)      0.4
                                  6     3     (6, 3)      0.5
                                  7     4     (7, 4)      0.57
                                  15    11    (15, 11)    0.73
                                  31    26    (31, 26)    0.838
Double-error correcting, t = 2    10    4     (10, 4)     0.4
Minimum code separation 5         15    8     (15, 8)     0.533
Triple-error correcting, t = 3    10    2     (10, 2)     0.2
Minimum code separation 7         15    5     (15, 5)     0.33
                                  23    12    (23, 12)    0.52

A code for which the inequalities in Eqs. (14.3) become equalities is known as a perfect
code. In such a code the Hamming spheres (about all the codewords) not only are nonoverlapping
but they exhaust all the $2^n$ vertices, leaving no vertex outside some sphere. An $e$-error
correcting perfect code satisfies the condition that every possible (received) sequence is at a
distance at most $e$ from some codeword. Perfect codes exist in only a comparatively few cases.
Binary, single-error correcting, perfect codes are called Hamming codes. For a Hamming
code, $t = 1$ and $d_{\min} = 3$, and from Eq. (14.3b) we have

$$ 2^m = \sum_{j=0}^{1} \binom{n}{j} = 1 + n \tag{14.4} $$

and

$$ n = 2^m - 1 $$

Thus, Hamming codes are $(n, k)$ codes with $n = 2^m - 1$ and $k = 2^m - 1 - m$ and minimum
distance $d_{\min} = 3$. In general, we often write a Hamming code as a $(2^m - 1, 2^m - 1 - m, 3)$
code. One of the most well-known Hamming codes is the (7, 4, 3) code.
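
The relations $n = 2^m - 1$ and $k = n - m$ are easy to tabulate. The short Python sketch below (the helper name `hamming_parameters` is an assumption introduced here for illustration) lists the first few Hamming codes and confirms that each meets Eq. (14.4) with equality.

```python
def hamming_parameters(m):
    """Return (n, k) of the Hamming code with m parity check digits."""
    n = 2 ** m - 1
    k = n - m
    return n, k

for m in range(3, 7):
    n, k = hamming_parameters(m)
    # Eq. (14.4): a perfect single-error correcting code satisfies 2^m = 1 + n.
    print(f"m = {m}: ({n}, {k}) code, rate = {k / n:.3f}, perfect: {2 ** m == 1 + n}")
```

The first three entries reproduce the (7, 4), (15, 11), and (31, 26) codes listed in Table 14.1.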

Another way of correcting errors is to design a code to detect (not to correct) up to $t$ errors.
When the receiver detects an error, it can request retransmission. This mechanism is known as
automatic repeat request (or ARQ). Because error detection requires fewer check digits, these
codes operate at a higher rate (efficiency).

To detect $t$ errors, codewords need to be separated by a Hamming distance of at least
$t + 1$. Otherwise, an erroneously received bit string with up to $t$ error bits could be another
transmitted codeword. Suppose a transmitted codeword $\mathbf{c}_j$ contains $\alpha$ bit errors ($\alpha \le t$). Then
the received word $\mathbf{c}_j'$ is at a distance of $\alpha$ from $\mathbf{c}_j$. Because $\alpha \le t$, however, $\mathbf{c}_j'$ can never
be any other valid codeword, since all codewords are separated by at least $t + 1$. Thus, the
reception of $\mathbf{c}_j'$ immediately indicates that an error has been made. Thus, the minimum distance
$d_{\min}$ between $t$ error detecting codewords is

$$ d_{\min} = t + 1 $$
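
The two distance requirements, $d_{\min} = 2t + 1$ for correction and $d_{\min} = t + 1$ for detection, can be inverted to give the capability of a code with a known minimum distance. The Python sketch below does this; both function names are hypothetical and chosen only for this illustration.

```python
def detectable_errors(d_min):
    """Largest t with d_min >= t + 1 (detection only)."""
    return d_min - 1

def correctable_errors(d_min):
    """Largest t with d_min >= 2t + 1 (correction)."""
    return (d_min - 1) // 2

for d_min in (2, 3, 5, 7):
    print(f"d_min = {d_min}: detect up to {detectable_errors(d_min)}, "
          f"correct up to {correctable_errors(d_min)}")
```

A code with $d_{\min} = 3$, for example, can be used either to detect up to two errors or to correct one.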


In presenting coding theory, we shall use modulo-2 addition, defined by

$$ 1 \oplus 1 = 0 \oplus 0 = 0 $$
$$ 0 \oplus 1 = 1 \oplus 0 = 1 $$

This is also known as the exclusive OR (XOR) operation in digital logic. Note that the modulo-2
sum of any binary digit with itself is always zero. All the additions in the mathematical
development of binary codes presented henceforth are modulo-2.

14.3 LINEAR BLOCK CODES

A codeword consists of $n$ digits $c_1, c_2, \ldots, c_n$, and a data word consists of $k$ digits $d_1, d_2, \ldots, d_k$.
Because the codeword and the data word are an $n$-tuple and a $k$-tuple, respectively, they
are $n$- and $k$-dimensional vectors. We shall use row vectors to represent these words:

$$ \mathbf{c} = (c_1, c_2, \ldots, c_n) $$
$$ \mathbf{d} = (d_1, d_2, \ldots, d_k) $$

For the general case of linear block codes, all the $n$ digits of $\mathbf{c}$ are formed by linear combinations
(modulo-2 additions) of $k$ data digits. A special case in which $c_1 = d_1$, $c_2 = d_2$, ..., $c_k = d_k$
and the remaining digits from $c_{k+1}$ to $c_n$ are linear combinations of $d_1, d_2, \ldots, d_k$ is known
as a systematic code. In a systematic code, the leading $k$ digits of a codeword are the data (or
information) digits and the remaining $m = n - k$ digits are the parity check digits, formed by
linear combinations of data digits $d_1, d_2, \ldots, d_k$:

$$ \begin{aligned}
c_1 &= d_1 \\
c_2 &= d_2 \\
&\;\;\vdots \\
c_k &= d_k \\
c_{k+1} &= h_{11} d_1 \oplus h_{12} d_2 \oplus \cdots \oplus h_{1k} d_k \\
c_{k+2} &= h_{21} d_1 \oplus h_{22} d_2 \oplus \cdots \oplus h_{2k} d_k \\
&\;\;\vdots \\
c_n &= h_{m1} d_1 \oplus h_{m2} d_2 \oplus \cdots \oplus h_{mk} d_k
\end{aligned} \tag{14.5a} $$

or

$$ \mathbf{c} = \mathbf{d}\mathbf{G} \tag{14.5b} $$

where

$$ \mathbf{G} = \left[ \begin{array}{ccccc|cccc}
1 & 0 & 0 & \cdots & 0 & h_{11} & h_{21} & \cdots & h_{m1} \\
0 & 1 & 0 & \cdots & 0 & h_{12} & h_{22} & \cdots & h_{m2} \\
\vdots & & & & \vdots & \vdots & & & \vdots \\
0 & 0 & 0 & \cdots & 1 & h_{1k} & h_{2k} & \cdots & h_{mk}
\end{array} \right] = [\, \mathbf{I}_k \;\; \mathbf{P} \,] \tag{14.6} $$

The $k \times n$ matrix $\mathbf{G}$ is called the generator matrix. For systematic codes, $\mathbf{G}$ can be partitioned
into a $k \times k$ identity matrix $\mathbf{I}_k$ and a $k \times m$ matrix $\mathbf{P}$. The elements of $\mathbf{P}$ are either 0 or 1. The
codeword can be expressed as

$$ \begin{aligned}
\mathbf{c} &= \mathbf{d}\mathbf{G} \\
&= \mathbf{d}[\, \mathbf{I}_k \;\; \mathbf{P} \,] \\
&= [\, \mathbf{d} \;\; \mathbf{d}\mathbf{P} \,] \\
&= [\, \mathbf{d} \;\; \mathbf{c}_p \,]
\end{aligned} \tag{14.7} $$

where the check digits, also known as the checksum bits or parity bits, are

$$ \mathbf{c}_p = \mathbf{d}\mathbf{P} \tag{14.8} $$

Thus, knowing the data digits, we can calculate the check digits $\mathbf{c}_p$ from Eq. (14.8) and consequently
the codeword $\mathbf{c}$. The weight of the codeword $\mathbf{c}$ is the number of 1s in the codeword.
The Hamming distance between two codewords $\mathbf{c}_a$ and $\mathbf{c}_b$ is the number of elements by which
they differ, or

$$ d(\mathbf{c}_a, \mathbf{c}_b) = \text{weight of } (\mathbf{c}_a \oplus \mathbf{c}_b) $$


Example 14.1 For a (6, 3) code, the generator matrix $\mathbf{G}$ is

$$ \mathbf{G} = \left[ \begin{array}{ccc|ccc}
1 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 0
\end{array} \right] = [\, \mathbf{I}_k \;\; \mathbf{P} \,] $$

For all eight possible data words, find the corresponding codewords, and verify that this code
is a single-error correcting code.

Table 14.2 shows the eight data words and the corresponding codewords formed from
$\mathbf{c} = \mathbf{d}\mathbf{G}$.

TABLE 14.2

Data Word d    Codeword c
111            111000
110            110110
101            101011
100            100101
011            011101
010            010011
001            001110
000            000000

Note that the distance between any two codewords is at least 3. Hence, the code can
correct at least one error. The possible encoder for this code shown in Fig. 14.1 uses a
three-digit shift register and three modulo-2 adders.

Figure 14.1 Encoder for linear block codes.

Linear Codes

A block code is a linear block code if for every pair of codewords $\mathbf{c}_a$ and $\mathbf{c}_b$ from the
block code,

$$ \mathbf{c}_a \oplus \mathbf{c}_b $$

is also a codeword. For this reason, linear codes must have an all-zero codeword $000 \cdots 00$.
For linear codes, the minimum distance equals the minimum weight.
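
Both properties can be checked directly on the eight codewords of Table 14.2. The following Python sketch stores the codewords as bit strings (a representation chosen here for brevity) and uses a hypothetical helper `xor` for the digit-by-digit modulo-2 sum.

```python
# Codewords of the (6, 3) code, copied from Table 14.2.
codewords = ["111000", "110110", "101011", "100101",
             "011101", "010011", "001110", "000000"]

def xor(a, b):
    """Digit-by-digit modulo-2 sum of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

# Linearity: the modulo-2 sum of any two codewords is again a codeword.
closed = all(xor(a, b) in codewords for a in codewords for b in codewords)

# Minimum weight over nonzero codewords versus minimum pairwise distance.
min_weight = min(c.count("1") for c in codewords if "1" in c)
min_distance = min(xor(a, b).count("1")
                   for a in codewords for b in codewords if a != b)

print(closed, min_weight, min_distance)
```

Because the sum of two codewords is itself a codeword, every pairwise distance is the weight of some nonzero codeword, which is why the two minima coincide.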

Decoding

Let us consider some codeword properties that could be used for the purpose of decoding.
From Eq. (14.8) and the fact that the modulo-2 sum of any sequence with itself is zero, we get

$$ \mathbf{d}\mathbf{P} \oplus \mathbf{c}_p = [\, \mathbf{d} \;\; \mathbf{c}_p \,] \begin{bmatrix} \mathbf{P} \\ \mathbf{I}_m \end{bmatrix} = \mathbf{0} \tag{14.9} $$

where $[\, \mathbf{d} \;\; \mathbf{c}_p \,]$ is the codeword $\mathbf{c}$ and $\mathbf{I}_m$ is the identity matrix of order $m \times m$ ($m = n - k$). Thus,

$$ \mathbf{c}\mathbf{H}^T = \mathbf{0} \tag{14.10a} $$

where

$$ \mathbf{H}^T = \begin{bmatrix} \mathbf{P} \\ \mathbf{I}_m \end{bmatrix} \tag{14.10b} $$

and its transpose

$$ \mathbf{H} = [\, \mathbf{P}^T \;\; \mathbf{I}_m \,] \tag{14.10c} $$

is called the parity check matrix. Every codeword must satisfy Eq. (14.10a). This is our clue
to decoding. Consider the received word $\mathbf{r}$. Because of possible errors caused by channel noise,
$\mathbf{r}$ in general differs from the transmitted codeword $\mathbf{c}$,

$$ \mathbf{r} = \mathbf{c} \oplus \mathbf{e} $$

where the error word (or error vector) $\mathbf{e}$ is also a row vector of $n$ elements. For example, if
the data word 100 in Example 14.1 is transmitted as a codeword 100101 (see Table 14.2), and
the channel noise causes a detection error in the third digit, then

r = 101101
c = 100101

and

e = 001000

Thus, an element 1 in $\mathbf{e}$ indicates an error in the corresponding position, and 0 indicates no
error. The Hamming distance between $\mathbf{r}$ and $\mathbf{c}$ is simply the number of 1s in $\mathbf{e}$.

Suppose the transmitted codeword is $\mathbf{c}_i$ and the channel noise causes an error $\mathbf{e}_i$, making
the received word $\mathbf{r} = \mathbf{c}_i \oplus \mathbf{e}_i$. If there were no errors, that is, if $\mathbf{e}_i$ were 000000, then we would
have $\mathbf{r}\mathbf{H}^T = \mathbf{0}$. But because of possible channel errors, $\mathbf{r}\mathbf{H}^T$ is in general a nonzero row vector
$\mathbf{s}$, called the syndrome:

$$ \mathbf{s} = \mathbf{r}\mathbf{H}^T \tag{14.11a} $$

$$ \mathbf{s} = (\mathbf{c}_i \oplus \mathbf{e}_i)\mathbf{H}^T = \mathbf{c}_i\mathbf{H}^T \oplus \mathbf{e}_i\mathbf{H}^T = \mathbf{e}_i\mathbf{H}^T \tag{14.11b} $$

Receiving $\mathbf{r}$, we can compute the syndrome $\mathbf{s}$ [Eq. (14.11a)] and presumably we can compute
$\mathbf{e}_i$ from Eq. (14.11b). Unfortunately, knowledge of $\mathbf{s}$ does not allow us to solve for $\mathbf{e}_i$ uniquely.
This is because $\mathbf{r}$ can also be expressed in terms of codewords other than $\mathbf{c}_i$. Thus,

$$ \mathbf{r} = \mathbf{c}_j \oplus \mathbf{e}_j \qquad j \ne i $$

Hence,

$$ \mathbf{s} = (\mathbf{c}_j \oplus \mathbf{e}_j)\mathbf{H}^T = \mathbf{e}_j\mathbf{H}^T $$

Because there are $2^k$ possible codewords,

$$ \mathbf{s} = \mathbf{e}\mathbf{H}^T $$

is satisfied by $2^k$ error vectors. In other words, the syndrome $\mathbf{s}$ by itself cannot define a unique
error vector. For example, if a data word d = 100 is transmitted by a codeword 100101 in
Example 14.1, and if a detection error is caused in the third digit, then the received word is
101101. In this case we have c = 100101 and e = 001000. But the same word could have
been received if c = 101011 and e = 000110, or if c = 010011 and e = 111110, and so
on. Thus, there are eight possible error vectors ($2^k$ error vectors) that all satisfy Eq. (14.11b).
Which vector shall we choose? For this, we must define our decision criterion. One reasonable
criterion is the maximum likelihood rule according to which, if we receive $\mathbf{r}$, then we decide
in favor of that $\mathbf{c}$ for which $\mathbf{r}$ is most likely to be received. In other words, we decide
"$\mathbf{c}_i$ transmitted" if

$$ P(\mathbf{r}|\mathbf{c}_i) > P(\mathbf{r}|\mathbf{c}_k) \qquad \text{all } k \ne i $$

For a binary symmetric channel (BSC), this rule gives a very simple answer. Suppose the
Hamming distance between $\mathbf{r}$ and $\mathbf{c}_i$ is $d$; that is, the channel noise causes errors in $d$ digits.
Then if $P_e$ is the digit error probability of a BSC,

$$ P(\mathbf{r}|\mathbf{c}_i) = P_e^d (1 - P_e)^{n-d} = (1 - P_e)^n \left( \frac{P_e}{1 - P_e} \right)^d $$

If $P_e < 0.5$ holds for a reasonable channel, then $P(\mathbf{r}|\mathbf{c}_i)$ is a monotonically decreasing function
of $d$ because $P_e/(1 - P_e) < 1$. Hence, to maximize $P(\mathbf{r}|\mathbf{c}_i)$, we must choose that $\mathbf{c}_i$ which is
closest to $\mathbf{r}$; that is, we must choose the error vector $\mathbf{e}$ with the smallest number of 1s. A vector
$\mathbf{e}$ with the smallest number of 1s is called the minimum weight vector. This minimum weight
error vector $\mathbf{e}_{\min}$ will be used to correct the error in $\mathbf{r}$ via

$$ \mathbf{c} = \mathbf{r} \oplus \mathbf{e}_{\min} $$
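
To see the monotonic behavior numerically, the Python sketch below evaluates $P(\mathbf{r}|\mathbf{c}_i) = P_e^d (1 - P_e)^{n-d}$ for every possible distance $d$. The values $n = 6$ and $P_e = 0.1$ are assumptions picked only for illustration, as is the function name `bsc_likelihood`.

```python
from math import comb

def bsc_likelihood(d, n, pe):
    """P(r | c) on a BSC when r and c differ in d of their n digits."""
    return pe ** d * (1 - pe) ** (n - d)

n, pe = 6, 0.1  # illustrative values, not taken from the text
values = [bsc_likelihood(d, n, pe) for d in range(n + 1)]

for d, p in enumerate(values):
    print(f"d = {d}: P(r|c) = {p:.6f}")

# Sanity check: summing over all 2^n received words gives total probability 1.
total = sum(comb(n, d) * values[d] for d in range(n + 1))
print("total probability:", total)
```

Each additional digit error multiplies the likelihood by $P_e/(1 - P_e) = 1/9$ in this case, so the closest codeword is always the most likely one.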


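Example 14.2 A linear (6, 3) code is generated according to the generating matrix in Example 14.1. The
receiver receives r = 100011. Determine the corresponding data word if the channel is a BSC
and the maximum likelihood decision is used.

We have

$$ \mathbf{s} = \mathbf{r}\mathbf{H}^T = [\, 1 \; 0 \; 0 \; 0 \; 1 \; 1 \,]
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= [\, 1 \; 1 \; 0 \,] $$

Because for modulo-2 operation, subtraction is the same as addition, the correct transmitted
codeword $\mathbf{c}$ is given by

$$ \mathbf{c} = \mathbf{r} \oplus \mathbf{e} $$

where $\mathbf{e}$ satisfies

$$ \mathbf{s} = [\, 1 \; 1 \; 0 \,] = \mathbf{e}\mathbf{H}^T = [\, e_1 \; e_2 \; e_3 \; e_4 \; e_5 \; e_6 \,]
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

We see that e = 001000 satisfies this equation. But so does e = 000110, or 010101, or
011011, or 111110, or 110000, or 101101, or 100011. The suitable choice, the minimum
weight $\mathbf{e}_{\min}$, is 001000. Hence,

c = 100011 ⊕ 001000 = 101011

Because the code is systematic, the data word is given by the first three digits, d = 101.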
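
The search carried out in Example 14.2 can be mimicked by brute force. The Python sketch below lists every error vector whose syndrome matches that of r = 100011 and then selects the one of minimum weight. It is an illustration only: exhaustive search over all $2^n$ vectors is practical just for very short codes, and the names `syndrome`, `candidates`, and `e_min` are introduced here.

```python
from itertools import product

# H^T = [P; I_3] for the (6, 3) code of Example 14.1.
H_T = [[1, 0, 1], [0, 1, 1], [1, 1, 0],
       [1, 0, 0], [0, 1, 0], [0, 0, 1]]

def syndrome(v):
    """Return s = v H^T with modulo-2 arithmetic."""
    return tuple(sum(v[i] * H_T[i][j] for i in range(6)) % 2 for j in range(3))

r = (1, 0, 0, 0, 1, 1)
s = syndrome(r)

# All 2^k = 8 error vectors that are consistent with the observed syndrome.
candidates = [e for e in product([0, 1], repeat=6) if syndrome(e) == s]
e_min = min(candidates, key=sum)           # minimum weight error vector
c = tuple(a ^ b for a, b in zip(r, e_min)) # corrected codeword

print("syndrome:", s)
print("candidates:", ["".join(map(str, e)) for e in candidates])
print("e_min:", e_min, "decoded codeword:", c, "data word:", c[:3])
```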





TABLE 14.3
Decoding Table for Code in Table 14.2

e         s
000000    000
100000    101
010000    011
001000    110
000100    100
000010    010
000001    001
100010    111

The decoding procedure just described is quite disorganized. To make it systematic, we
would consider all possible syndromes and for each syndrome associate a minimum weight
error vector. For instance, the single-error correcting code in Example 14.1 has a syndrome with
three digits. Hence, there are eight possible syndromes. We prepare a table of minimum weight
error vectors corresponding to each syndrome (see Table 14.3). This table can be prepared by
considering all possible minimum weight error vectors and using Eq. (14.11b) to compute $\mathbf{s}$
for each of them. The first minimum weight error vector 000000 is a trivial case that has the
syndrome 000. Next, we consider all possible unit weight error vectors. There are six such
vectors: 100000, 010000, 001000, 000100, 000010, 000001. Syndromes for these can readily
be calculated from Eq. (14.11b) and tabulated (Table 14.3). This still leaves one syndrome,
111, that is not matched with an error vector. Since all unit weight error vectors are exhausted,
we must look for error vectors of weight 2.

We find that for the first seven syndromes (Table 14.3), there is a unique minimum weight
vector $\mathbf{e}$. But for s = 111, the error vector $\mathbf{e}$ has a minimum weight of 2, and it is not unique.
For example, e = 100010 or 010100 or 001001 all have s = 111, and all three $\mathbf{e}$ are minimum
weight (weight 2). In such a case, we can pick any one $\mathbf{e}$ as a correctable error pattern. In
Table 14.3, we have picked e = 100010 as the double-error correctable pattern. This means the
present code can correct all six single-error patterns and one double-error pattern (100010). For
instance, if c = 101011 is transmitted and the channel noise causes the double error 100010,
the received vector r = 001001, and

$$ \mathbf{s} = \mathbf{r}\mathbf{H}^T = [\, 1 \; 1 \; 1 \,] $$

From Table 14.3 we see that corresponding to s = 111 is e = 100010, and we immediately
decide c = r ⊕ e = 101011. Note, however, that this code will not correct double-error
patterns except for 100010. Thus, this code corrects not only all single errors but one double-error
pattern as well. This extra bonus of one double-error correction occurs because $n$ and
$k$ oversatisfy the Hamming bound [Eq. (14.3b)]. In case $n$ and $k$ were to satisfy the bound
exactly, we would have only single-error correction ability. This is the case for the (7, 4) code,
which can correct all single-error patterns only.
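
A decoding table of this kind can be generated mechanically. The Python sketch below visits error vectors in order of increasing weight and keeps the first one found for each syndrome. The tie-breaking rule in the sort key is an arbitrary assumption made here so that the weight-2 entry matches the choice 100010 in Table 14.3; any other weight-2 vector with syndrome 111 would be an equally valid entry. The function names are illustrative.

```python
from itertools import product

# H^T = [P; I_3] for the (6, 3) code of Example 14.1.
H_T = [[1, 0, 1], [0, 1, 1], [1, 1, 0],
       [1, 0, 0], [0, 1, 0], [0, 0, 1]]

def syndrome(v):
    """Return s = v H^T with modulo-2 arithmetic."""
    return tuple(sum(v[i] * H_T[i][j] for i in range(6)) % 2 for j in range(3))

def build_decoding_table():
    """Map each syndrome to one minimum weight error vector."""
    table = {}
    # Sort by weight first; among equal weights, vectors with leading 1s come first.
    ordered = sorted(product([0, 1], repeat=6),
                     key=lambda e: (sum(e), tuple(-b for b in e)))
    for e in ordered:
        table.setdefault(syndrome(e), e)  # keep only the first (lightest) entry
    return table

def decode(r, table):
    """Correct r by adding the tabulated error vector for its syndrome."""
    e = table[syndrome(r)]
    return tuple(a ^ b for a, b in zip(r, e))

table = build_decoding_table()
for s, e in sorted(table.items(), key=lambda item: sum(item[1])):
    print("".join(map(str, e)), "".join(map(str, s)))
print(decode((0, 0, 1, 0, 0, 1), table))
```

Decoding the received word 001001 with this table returns 101011, in agreement with the discussion above.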

Thus for systematic decoding, we prepare a table of all correctable error patterns and the
corresponding syndromes. For decoding, we need only calculate $\mathbf{s} = \mathbf{r}\mathbf{H}^T$ and, from the decoding
table, find the corresponding $\mathbf{e}$. The decision is $\mathbf{c} = \mathbf{r} \oplus \mathbf{e}$. Because $\mathbf{s}$ has $m = n - k$ digits,
there is a total of $2^{n-k}$ syndromes, each consisting of $n - k$ digits. There is the same number
of correctable error vectors, each of $n$ digits. Hence, for the purpose of decoding, we need a
storage of $(2n - k)2^{n-k} = (2n - k)2^m$ bits. This storage requirement grows exponentially
with $m$, the number of parity check digits, and can be enormous, even for moderately
complex codes.
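
To get a feeling for this growth, the storage expression $(2n - k)2^{n-k}$ can be evaluated for a few of the codes in Table 14.1. The Python helper below, named `decoding_table_bits` for this illustration, is a direct transcription of that expression.

```python
def decoding_table_bits(n, k):
    """Bits needed to store 2^(n-k) pairs of (syndrome, error vector)."""
    return (2 * n - k) * 2 ** (n - k)

for n, k in [(6, 3), (7, 4), (15, 11), (15, 5), (23, 12)]:
    print(f"({n}, {k}) code: {decoding_table_bits(n, k)} bits")
```

The (6, 3) code needs only 72 bits, whereas the (23, 12) code already needs 69,632 bits.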

Constructing Hamming Codes

It is still not clear how to design or choose coefficients of the generator or parity check matrix.
Unfortunately, there is no general systematic way to design codes, except for the few special
cases such as cyclic codes and the class of single-error correcting codes known as Hamming
codes. Let us consider a single-error correcting (7, 4) code. This code satisfies the Hamming
bound exactly, and we shall see that a proper code can be constructed. In this case $m = 3$, and
there are seven nonzero syndromes, and because $n = 7$, there are exactly seven single-error
patterns. Hence, we can correct all single-error patterns and no more. Consider the single-error
pattern e = 1000000. Because

$$ \mathbf{s} = \mathbf{e}\mathbf{H}^T $$

$\mathbf{e}\mathbf{H}^T$ will be simply the first row of $\mathbf{H}^T$. Similarly, for e = 0100000, $\mathbf{s} = \mathbf{e}\mathbf{H}^T$ will be the
second row of $\mathbf{H}^T$, and so on. Now for unique decodability, we require that all seven syndromes
corresponding to the seven single-error patterns be distinct. Conversely, if all seven syndromes
are distinct, we can decode all the single-error patterns. This means that the only requirement
on $\mathbf{H}^T$ is that all seven of its rows be distinct and nonzero. Note that $\mathbf{H}^T$ is an $n \times (n - k)$
matrix (i.e., 7 x 3 in this case). Because there exist seven nonzero patterns of three digits, it is
possible to find seven nonzero rows of three digits each. There are many ways in which these
rows can be ordered. But we emphasize that the three bottom rows must form the identity
matrix $\mathbf{I}_m$ [see Eq. (14.10b)].

One possible form of $\mathbf{H}^T$ is

$$ \mathbf{H}^T = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \mathbf{P} \\ \mathbf{I}_m \end{bmatrix} $$

The corresponding generator matrix $\mathbf{G}$ is

$$ \mathbf{G} = [\, \mathbf{I}_k \;\; \mathbf{P} \,] = \left[ \begin{array}{cccc|ccc}
1 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{array} \right] $$

Thus when d = 1011, the corresponding codeword c = 1011001, and so forth.
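
The (7, 4) construction can be exercised end to end in a few lines. The Python sketch below encodes a data word with $\mathbf{P}$, computes the syndrome with $\mathbf{H}^T$, and corrects a single error by locating the row of $\mathbf{H}^T$ that equals the syndrome. The function names are assumptions made for this illustration, and the corrector handles at most one error per word.

```python
# P and H^T = [P; I_3] for the (7, 4) Hamming code given above.
P = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
H_T = P + I3

def encode(d):
    """Systematic encoding c = [d, dP] with modulo-2 arithmetic."""
    parity = [sum(d[i] * P[i][j] for i in range(4)) % 2 for j in range(3)]
    return list(d) + parity

def syndrome(r):
    """Return s = r H^T with modulo-2 arithmetic."""
    return [sum(r[i] * H_T[i][j] for i in range(7)) % 2 for j in range(3)]

def correct_single_error(r):
    """Flip the digit whose row of H^T equals the syndrome (if nonzero)."""
    s = syndrome(r)
    corrected = list(r)
    if any(s):
        corrected[H_T.index(s)] ^= 1
    return corrected

c = encode([1, 0, 1, 1])
print("codeword:", c)

received = list(c)
received[2] ^= 1  # introduce an error in the third digit
print("received:", received, "corrected:", correct_single_error(received))
```

Because the seven rows of $\mathbf{H}^T$ are distinct and nonzero, the row index found by the search identifies the erroneous digit uniquely.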

A general linear $(n, k)$ code has $m$-dimensional syndrome vectors ($m = n - k$). Hence,
there are $2^m - 1$ distinct nonzero syndrome vectors that can correct $2^m - 1$ single-error patterns.
Because in an $(n, k)$ code there are exactly $n$ single-error patterns, all these single errors can
be corrected if

$$ 2^m - 1 \ge n $$

This is precisely the condition in Eq. (14.3b) for $t = 1$. Thus, for any $(n, k)$ satisfying this
condition, it is possible to construct a single-error correcting code by the procedure discussed.
To summarize, a $(2^m - 1, 2^m - 1 - m, 3)$ Hamming code has the following attributes:

Number of parity bits          $m \ge 3$
Code length                    $n = 2^m - 1$
Number of message bits         $k = 2^m - m - 1$
Minimum distance               $d_{\min} = 3$
Error correcting capability    $t = 1$

For more discussion on block coding, the readers should consult the books by Peterson
and Weldon3 and by Lin and Costello.2


14.4 CYCLIC CODES 

Cyclic codes are a subclass of linear block codes. As seen before, a procedure for selecting a 
generator matrix is relatively easy for single-error correcting codes. This procedure, however, 
cannot carry us very far in constructing higher order error correcting codes. Cyclic codes
offer two advantages. First, they have a mathematical structure that permits the design of
higher order error correcting codes. Second, for cyclic codes, encoding and syndrome
calculations can be easily implemented by using simple shift registers.

In cyclic codes, the codewords are simple lateral cyclic shifts of one another. For
example, if c = (c_1, c_2, ..., c_{n-1}, c_n) is a codeword, then so are (c_2, c_3, ..., c_n, c_1),
(c_3, c_4, ..., c_n, c_1, c_2), and so on. We shall use the following notation. If

c = (c_1, c_2, ..., c_n)    (14.12a)

is a code vector of a code C, then c^(i) denotes c shifted cyclically i places to the left, that is,

c^(i) = (c_{i+1}, c_{i+2}, ..., c_n, c_1, c_2, ..., c_i)    (14.12b)

Cyclic codes can be described in a polynomial form. This property is extremely useful
in the analysis and implementation of these codes. The code vector c in Eq. (14.12a) can be
expressed as the (n - 1)-degree polynomial

c(x) = c_1 x^{n-1} + c_2 x^{n-2} + ... + c_n    (14.13a)

The coefficients of the polynomial are either 0 or 1, and they obey the following properties:

0 + 0 = 0              0 x 0 = 0
0 + 1 = 1 + 0 = 1      0 x 1 = 1 x 0 = 0
1 + 1 = 0              1 x 1 = 1

The code polynomial c^(i)(x) for the code vector c^(i) in Eq. (14.12b) is

c^(i)(x) = c_{i+1} x^{n-1} + c_{i+2} x^{n-2} + ... + c_n x^i + c_1 x^{i-1} + ... + c_i    (14.13b)

One of the interesting properties of code polynomials is that when x^i c(x) is divided by x^n + 1,
the remainder is c^(i)(x). We can verify this property as follows:

x c(x) = c_1 x^n + c_2 x^{n-1} + ... + c_n x

Dividing by x^n + 1 gives the quotient c_1:

            c_1
x^n + 1 )   c_1 x^n + c_2 x^{n-1} + ... + c_n x
            c_1 x^n                             + c_1
            -----------------------------------------
                      c_2 x^{n-1} + c_3 x^{n-2} + ... + c_n x + c_1    (remainder)

The remainder is clearly c^(1)(x). In deriving this result, we have used the fact that subtraction
amounts to summation when modulo-2 operations are involved. Continuing in this fashion,
we can show that the remainder of x^i c(x) divided by x^n + 1 is c^(i)(x).
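
The remainder property can be checked numerically. The sketch below is only an illustration: it stores a polynomial over GF(2) as a Python integer whose bit i is the coefficient of x^i (a representation chosen here, not one used in the text), and the names gf2_remainder and rotate_left are hypothetical helpers.

```python
def gf2_remainder(a, m):
    """Remainder of a(x) / m(x) over GF(2); bit i of an int is the coefficient of x^i."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        # Cancel the leading term of a(x) with a shifted copy of m(x).
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


def rotate_left(c, i, n):
    """Cyclically shift the n-digit word c by i places to the left."""
    i %= n
    return ((c << i) | (c >> (n - i))) & ((1 << n) - 1)


n = 7
c = 0b1110010                      # c(x) = x^6 + x^5 + x^4 + x
x_n_plus_1 = (1 << n) | 1          # x^7 + 1

# Rem[x^i c(x) / (x^n + 1)] for i = 0, 1, ..., n - 1
shifts = [gf2_remainder(c << i, x_n_plus_1) for i in range(n)]
```

Each entry of shifts equals the word c rotated i places to the left, which is the statement that the remainder of x^i c(x) divided by x^n + 1 is c^(i)(x).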

We now introduce the concept of code generator polynomial g(x). Since each (n, k)
codeword can be represented by a code polynomial

c(x) = c_1 x^{n-1} + c_2 x^{n-2} + ... + c_n

g(x) is a code generator polynomial (of degree n - k) if, for a data polynomial d(x) of
degree k - 1

d(x) = d_1 x^{k-1} + d_2 x^{k-2} + ... + d_k

we can generate the code polynomial via

c(x) = d(x)g(x)    (14.14)

Notice that there are 2^k distinct code polynomials (or codewords). For a cyclic code, a codeword
after cyclic shift is still a codeword.

We shall now prove an important theorem in cyclic codes:

Cyclic Linear Block Code Theorem: If g(x) is a polynomial of degree n - k and is
a factor of x^n + 1 (modulo-2), then g(x) is a generator polynomial that generates an (n, k)
linear cyclic block code.

Proof: For a data vector (d_1, d_2, ..., d_k), the data polynomial is

d(x) = d_1 x^{k-1} + d_2 x^{k-2} + ... + d_k    (14.15)

Consider k polynomials

g(x), x g(x), ..., x^{k-1} g(x)

which have degrees n - k, n - k + 1, ..., n - 1, respectively. Hence, a linear combination of
these polynomials equals

d_1 x^{k-1} g(x) + d_2 x^{k-2} g(x) + ... + d_k g(x) = d(x)g(x)    (14.16)

Regardless of the data values {d_i}, d(x)g(x) still has degree n - 1 or less while being a
multiple of g(x). Hence, a codeword is formed by using Eq. (14.16). There are a total of 2^k
such distinct polynomials (codewords) of the data polynomial d(x), corresponding to 2^k data
vectors. Thus, we have a linear (n, k) code generated by Eq. (14.14). To prove that this code is
cyclic, let

c(x) = c_1 x^{n-1} + c_2 x^{n-2} + ... + c_n

be a code polynomial in this code [Eq. (14.16)]. Then,

x c(x) = c_1 x^n + c_2 x^{n-1} + ... + c_n x
       = c_1 (x^n + 1) + (c_2 x^{n-1} + c_3 x^{n-2} + ... + c_n x + c_1)
       = c_1 (x^n + 1) + c^(1)(x)

Because xc(x) is xd(x)g(x), and g(x) is a factor of x^n + 1, c^(1)(x) must also be a multiple
of g(x) and can also be expressed as d(x)g(x) for some data vector d. Therefore, c^(1)(x)
is also a code polynomial. Continuing this way, we see that c^(2)(x), c^(3)(x), ... are all code
polynomials generated by Eq. (14.16). Thus, the linear (n, k) code generated by d(x)g(x) is
indeed cyclic.

Example 14.3  Find a generator polynomial g(x) for a (7, 4) cyclic code, and find code vectors for the following
data vectors: 1010, 1111, 0001, and 1000.

In this case n = 7 and n - k = 3, and

x^7 + 1 = (x + 1)(x^3 + x + 1)(x^3 + x^2 + 1)

For a (7, 4) code, the generator polynomial must be of the order n - k = 3. In this case,
there are two possible choices for g(x): x^3 + x + 1 or x^3 + x^2 + 1. Let us pick the latter,
that is,

g(x) = x^3 + x^2 + 1

as a possible generator polynomial. For d = [1 0 1 0],

d(x) = x^3 + x

and the code polynomial is

c(x) = d(x)g(x)
     = (x^3 + x)(x^3 + x^2 + 1)
     = x^6 + x^5 + x^4 + x

Hence,
c = 1110010

TABLE 14.4

d       c
1010    1110010
1111    1001011
0001    0001101
1000    1101000

Similarly, codewords for other data words can be found (Table 14.4). Note the structure
of the codewords. The first k digits are not necessarily the data bits. Hence, this is not a
systematic code.
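
The multiplication c(x) = d(x)g(x) of Example 14.3 can be sketched in a few lines. This is an illustrative sketch only: polynomials are stored as integers (bit i is the coefficient of x^i), and gf2_multiply and encode_nonsystematic are hypothetical helper names.

```python
def gf2_multiply(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit i = coefficient of x^i)."""
    product = 0
    while b:
        if b & 1:
            product ^= a      # modulo-2 addition of the shifted copy of a(x)
        a <<= 1
        b >>= 1
    return product


G_POLY = 0b1101               # g(x) = x^3 + x^2 + 1
N = 7                         # code length


def encode_nonsystematic(data_bits):
    """Return the codeword for c(x) = d(x)g(x) as a string of N digits."""
    d = int(data_bits, 2)
    return format(gf2_multiply(d, G_POLY), f"0{N}b")


table_14_4 = {d: encode_nonsystematic(d) for d in ("1010", "1111", "0001", "1000")}
```

The resulting dictionary reproduces the four rows of Table 14.4, and the data digits are visibly not a prefix of every codeword.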


In a systematic code, the first k digits are data bits, and the last m = n — k digits are the 
parity check bits. Systematic codes are a special case of general codes. Our discussion thus far 
applies to general cyclic codes, of which systematic cyclic codes are a special case. We shall 
now develop a method of generating systematic cyclic codes. 

Systematic Cyclic Codes 

We shall show that for a systematic code, the codeword polynomial c(x) corresponding to the
data polynomial d(x) is given by

c(x) = x^{n-k} d(x) + ρ(x)    (14.17a)

where ρ(x) is the remainder from dividing x^{n-k} d(x) by g(x),

ρ(x) = Rem [ x^{n-k} d(x) / g(x) ]    (14.17b)

To prove this we observe that

x^{n-k} d(x) / g(x) = q(x) + ρ(x) / g(x)    (14.18a)

where q(x) is of degree k - 1 or less. We add ρ(x)/g(x) to both sides of Eq. (14.18a), and
because ρ(x) + ρ(x) = 0 under modulo-2 operation, we have

[ x^{n-k} d(x) + ρ(x) ] / g(x) = q(x)    (14.18b)

or

q(x)g(x) = x^{n-k} d(x) + ρ(x)    (14.18c)

Because q(x) is on the order of k - 1 or less, q(x)g(x) is a code polynomial. As x^{n-k} d(x)
represents d(x) shifted to the left by n - k digits, the first k digits of this codeword are
precisely d, and the last n - k digits corresponding to ρ(x) must be parity check digits. This
will become clear by considering a specific example.

Example 14.4  Construct a systematic (7, 4) cyclic code using a generator polynomial (see Example 14.3).

We use

g(x) = x^3 + x^2 + 1

Consider a data vector d = 1010,

d(x) = x^3 + x

and

x^{n-k} d(x) = x^6 + x^4

Hence,

                    x^3 + x^2 + 1                  (quotient q(x))
x^3 + x^2 + 1 )     x^6 + x^4
                    x^6 + x^5 + x^3
                    ---------------
                          x^5 + x^4 + x^3
                          x^5 + x^4 + x^2
                          ---------------
                                      x^3 + x^2
                                      x^3 + x^2 + 1
                                      -------------
                                                  1    (remainder ρ(x))

Hence, from Eq. (14.17a),

c(x) = x^3 d(x) + ρ(x)
     = x^3 (x^3 + x) + 1
     = x^6 + x^4 + 1

and

c = 1010001

We could also have found the codeword c directly by using Eq. (14.18c). Thus, c(x) =
q(x)g(x) = (x^3 + x^2 + 1)(x^3 + x^2 + 1) = x^6 + x^4 + 1. We construct the entire code table in
this manner (Table 14.5). This is quite a tedious procedure. There is, however, a shortcut,
by means of the code generating matrix G. We can use the earlier procedure to compute the
codewords corresponding to the data words 1000, 0100, 0010, 0001. These are 1000110,
0100011, 0010111, 0001101. Now recognize that these four codewords are the four rows
of G. This is because c = dG, and when d = 1000, d · G is the first row of G, and
so on. Hence,

        | 1 0 0 0 1 1 0 |
G    =  | 0 1 0 0 0 1 1 |
        | 0 0 1 0 1 1 1 |
        | 0 0 0 1 1 0 1 |

Now, we can use c = dG to construct the rest of the code table. This is an efficient
method because it allows us to construct the entire code table from the knowledge of only
k codewords.

TABLE 14.5

d       c
1111    1111111
1110    1110010
1101    1101000
1100    1100101
1011    1011100
1010    1010001
1001    1001011
1000    1000110
0111    0111001
0110    0110100
0101    0101110
0100    0100011
0011    0011010
0010    0010111
0001    0001101
0000    0000000

Table 14.5 shows the complete code. Note that d_min, the minimum distance between
two codewords, is 3. Hence, this is a single-error correcting code, and 14 of these
codewords can be obtained by successive cyclic shifts of the two codewords 1110010 and
1101000. The remaining two codewords, 1111111 and 0000000, remain unchanged under
cyclic shift.
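
The systematic encoding rule of Eq. (14.17a) lends itself to a short numerical check. The sketch below is an illustration under the same integer representation of GF(2) polynomials (bit i is the coefficient of x^i); the helper names are hypothetical.

```python
from itertools import combinations


def gf2_remainder(a, g):
    """Remainder of a(x) / g(x) over GF(2); bit i of an int is the coefficient of x^i."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


N, K = 7, 4
G_POLY = 0b1101                      # g(x) = x^3 + x^2 + 1


def encode_systematic(d):
    """c(x) = x^(n-k) d(x) + rho(x), with rho(x) the remainder after division by g(x)."""
    shifted = d << (N - K)           # x^(n-k) d(x)
    return shifted ^ gf2_remainder(shifted, G_POLY)


# Full code table: data word -> codeword, both as digit strings.
code_table = {
    format(d, f"0{K}b"): format(encode_systematic(d), f"0{N}b")
    for d in range(2 ** K)
}

# Minimum Hamming distance over all pairs of distinct codewords.
d_min = min(
    bin(int(a, 2) ^ int(b, 2)).count("1")
    for a, b in combinations(code_table.values(), 2)
)
```

Every codeword in code_table begins with its data word, which is the systematic property, and the computed d_min agrees with the value quoted for Table 14.5.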


Generator Polynomial and Generator Matrix of Cyclic Codes 

Cyclic codes can also be described by a generator matrix G (Probs. 14.3-6 and 14.3-7). It can
be shown that Hamming codes are cyclic codes. Once the generator polynomial g(x) has been
given, it is simple to find the systematic code generator matrix G = [I P] by determining the
parity submatrix P:

1st row of P:   Rem [ x^{n-1} / g(x) ]
2nd row of P:   Rem [ x^{n-2} / g(x) ]
    ...
kth row of P:   Rem [ x^{n-k} / g(x) ]    (14.19)

Consider a Hamming (7, 4, 3) code with generator polynomial

g(x) = x^3 + x + 1    (14.20)

We have

Rem [ x^6 / g(x) ] = x^2 + 1
Rem [ x^5 / g(x) ] = x^2 + x + 1
Rem [ x^4 / g(x) ] = x^2 + x
Rem [ x^3 / g(x) ] = x + 1

Therefore, the cyclic code generator matrix is

        | 1 0 0 0 1 0 1 |
G    =  | 0 1 0 0 1 1 1 |
        | 0 0 1 0 1 1 0 |
        | 0 0 0 1 0 1 1 |

Correspondingly, one form of its parity check matrix is

        | 1 1 1 0 1 0 0 |
H    =  | 0 1 1 1 0 1 0 |    (14.22)
        | 1 1 0 1 0 0 1 |

Cyclic Code Generation

One of the advantages of cyclic codes is that their encoding and decoding can be implemented
by means of such simple elements as shift registers and modulo-2 adders. A systematically
generated code is described in Eqs. (14.17). It involves a division of x^{n-k} d(x) by g(x) that can
be implemented by a dividing circuit consisting of a shift register with feedback connections
according to the generator polynomial* g(x) = x^{n-k} + g_1 x^{n-k-1} + ... + g_{n-k-1} x + 1. The
gains g_i are either 0 or 1. An encoding circuit with n - k shift registers is shown in Fig. 14.2. An
understanding of this dividing circuit requires some background in linear sequential networks.
An explanation of its functioning can be found in Peterson and Weldon. 3 The k data digits
are shifted in one at a time at the input with the switch s held at position p_1. The symbol D
represents a one-digit delay. As the data digits move through the encoder, they are also shifted
out onto the output line, because the first k digits of the codeword are the data digits themselves.
As soon as the last (or kth) data digit clears the last [or (n - k)th] register, all the registers
contain the parity check digits. The switch s is now thrown to position p_2, and the n - k parity
check digits are shifted out one at a time onto the line.

Figure 14.2  Encoder for systematic cyclic code.

* It can be shown that for cyclic codes, the generator polynomial must be of this form.
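
A software model can make the dividing circuit more concrete. The sketch below is one common way to arrange a feedback shift register that divides by g(x); the exact wiring of Fig. 14.2 is not reproduced here, so the stage ordering, the function name lfsr_parity, and the coefficient list g_coeffs (indexed by power of x, unlike the g_1, ..., g_{n-k-1} labels in the text) are assumptions made for illustration.

```python
def lfsr_parity(data_bits, g_coeffs):
    """Shift data through a feedback shift register that divides by g(x).

    g_coeffs = [g_0, g_1, ..., g_(m-1)] are the low-order coefficients of
    g(x) = x^m + g_(m-1) x^(m-1) + ... + g_1 x + g_0 (the x^m term is implied).
    Returns the register contents, high-order stage first, which are the
    coefficients of Rem[x^m d(x) / g(x)].
    """
    m = len(g_coeffs)
    reg = [0] * m
    for bit in data_bits:
        feedback = bit ^ reg[m - 1]
        # Update from the high-order stage down so each stage sees the old value.
        for i in range(m - 1, 0, -1):
            reg[i] = reg[i - 1] ^ (g_coeffs[i] & feedback)
        reg[0] = g_coeffs[0] & feedback
    return reg[::-1]


# Example 14.4: g(x) = x^3 + x^2 + 1 and d = 1010.
data = [1, 0, 1, 0]
parity = lfsr_parity(data, [1, 0, 1])
codeword = data + parity

# Rows of P from Eq. (14.19) for g(x) = x^3 + x + 1: feeding the i-th unit
# data vector through the register leaves Rem[x^(n-i) / g(x)] in the stages.
K = 4
P = [lfsr_parity([int(j == i) for j in range(K)], [1, 1, 0]) for i in range(K)]
```

After the k data digits have been shifted in, the register holds the parity check digits, so the same routine also yields the rows of the parity submatrix P when it is fed unit data vectors.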


Decoding 

Every valid code polynomial c(x) is a multiple of g(x). In other words, c(x) is divisible by 
g(x). When an error occurs during the transmission, the received word polynomial r(x) will 
not be a multiple of g(x) if the number of errors in r is correctable. Thus,

r(x) / g(x) = m_1(x) + s(x) / g(x)    (14.23)

and

s(x) = Rem [ r(x) / g(x) ]    (14.24)

where the syndrome polynomial s(x) has a degree n - k - 1 or less.
If e(x) is the error polynomial, then

r(x) = c(x) + e(x)

Remembering that c(x) is a multiple of g(x),

s(x) = Rem [ r(x) / g(x) ]
     = Rem [ (c(x) + e(x)) / g(x) ]
     = Rem [ e(x) / g(x) ]    (14.25)

Again, as before, a received word r could result from any one of the 2^k codewords and a suitable
error. For example, for the code in Table 14.5, if r = 0110010, this could mean c = 1110010
and e = 1000000, or c = 1101000 and e = 1011010, or 14 more such combinations. As seen
earlier, the most likely error pattern is the one with the minimum weight (or minimum number
of 1s). Hence, here c = 1110010 and e = 1000000 is the correct decision.

It is convenient to prepare a decoding table, that is, to list the syndromes for all correctable
errors. For any r, we compute the syndrome from Eq. (14.24), and from the table we find the
corresponding correctable error e. Then we determine c = r ⊕ e. Note that computation of
s(x) [Eq. (14.24)] involves exactly the same operation as that required to compute ρ(x) during
encoding [Eq. (14.18a)]. Hence, the circuit in Fig. 14.2 can also be used to compute s(x).


Example 14.5  Construct the decoding table for the single-error correcting (7, 4) code in Table 14.5. Determine
the data vectors transmitted for the following received vectors r: (a) 1101101; (b) 0101000;
(c) 0001100.

The first step is to construct the decoding table. Because n - k - 1 = 2, the syndrome
polynomial is of the second order, and there are seven possible nonzero syndromes. There
are also seven possible correctable single-error patterns because n = 7. We can use

s = e · H^T

to compute the syndrome for each of the seven correctable error patterns. Note that
(Example 14.4)

        | 1 0 1 1 1 0 0 |
H    =  | 1 1 1 0 0 1 0 |
        | 0 1 1 1 0 0 1 |

We can now compute the syndromes based on H. For example, for e = 1000000,

s = [1 0 0 0 0 0 0] H^T
  = 110

In a similar way, we compute the syndromes for the remaining error patterns (see
Table 14.6).

TABLE 14.6

e          s
1000000    110
0100000    011
0010000    111
0001000    101
0000100    100
0000010    010
0000001    001

When the received word r is 1101101, we can compute s(x) either according to
Eq. (14.24) or by simply applying the matrix product

s = r · H^T
  = [1 1 0 1 1 0 1] H^T
  = 101

Hence, from Table 14.6, this gives e = 0001000, and

c = r ⊕ e = 1101101 ⊕ 0001000 = 1100101

Because this code is systematic,

d = 1100

In a similar way, we determine for r = 0101000, s = 110 and e = 1000000; hence
c = r ⊕ e = 1101000, and d = 1101. For r = 0001100, s = 001 and e = 0000001;
hence c = r ⊕ e = 0001101, and d = 0001.
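
The table-lookup decoder of this example can be sketched as follows. This is an illustration only: it computes the syndrome as the polynomial remainder of Eq. (14.24) rather than by the matrix product, stores polynomials as integers (bit i is the coefficient of x^i), and uses hypothetical names such as decoding_table and decode.

```python
G_POLY = 0b1101                    # g(x) = x^3 + x^2 + 1
N, K = 7, 4


def syndrome(word):
    """Rem[word(x) / g(x)] over GF(2)."""
    deg_g = G_POLY.bit_length() - 1
    while word.bit_length() - 1 >= deg_g:
        word ^= G_POLY << (word.bit_length() - 1 - deg_g)
    return word


# Decoding table: syndrome -> single-error pattern (the content of Table 14.6).
decoding_table = {syndrome(1 << pos): 1 << pos for pos in range(N)}


def decode(received_bits):
    """Return (corrected codeword, data word) for a received digit string."""
    r = int(received_bits, 2)
    e = decoding_table.get(syndrome(r), 0)   # zero syndrome: assume no error
    c = r ^ e
    return format(c, f"0{N}b"), format(c >> (N - K), f"0{K}b")


results = [decode(r) for r in ("1101101", "0101000", "0001100")]
```

Because the code is systematic, the data word is read off as the first k digits of the corrected codeword.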
Bose-Chaudhuri-Hocquenghem (BCH) Codes and
Reed-Solomon Codes

The BCH codes are perhaps the best studied class of random error correcting cyclic codes.
Moreover, their decoding procedure can be implemented simply. The Hamming code is a
special case of BCH codes. These codes are described as follows: for any positive integers m and
t (t < 2^{m-1}), there exists a t-error correcting (n, k) code with n = 2^m - 1 and n - k ≤ mt. The
minimum distance between codewords is bounded by the inequality 2t + 1 ≤ d_min ≤ 2t + 2.

As a special case of nonbinary BCH codes, Reed-Solomon codes are by far the most
successful forward error correction (FEC) codes in practice today. Reed-Solomon codes have
found broad applications in digital storage (DVD, CD-ROM), high-speed modems, broadband
wireless systems, and HDTV, among numerous others. The detailed treatment of BCH codes
and Reed-Solomon codes requires extensive use of modern algebra and is beyond the scope
of this introductory chapter. For in-depth discussion of BCH codes and Reed-Solomon codes,
the reader is referred to the classic text by Lin and Costello. 2

Cyclic Redundancy Check (CRC) Codes for Error Detection 

One of the most widely used classes of cyclic codes is the cyclic redundancy check (CRC)
codes for detection of data transmission errors. CRC codes are cyclic, designed to detect
erroneous data packets at the receivers (often after error correction). To verify the integrity
of the payload data block (packet), each data packet is encoded by CRC codes of length
n ≤ 2^m - 1. The most common CRC codes have m = 12, 16, or 32 with code generator
polynomial of the form

g(x) = (1 + x) g_c(x)

g_c(x) = generator polynomial of a cyclic Hamming code

To select a code generator polynomial, the design criterion is to control the probability of
undetected error events. In other words, the CRC codes must be able to detect the most likely
error patterns such that the probability of undetected errors

P(eH^T = 0 | e ≠ 0) < ε    (14.26)

where ε is set by the user according to its quality requirement. The most common CRC codes
are given in Table 14.7 along with their generator polynomials. For each frame of data bits at
the transmitter, the CRC encoder generates a frame-checking sequence (FCS) of length 8, 12,
16, or 32 bits for error detection. For example, the IEEE 802.11 and IEEE 802.11b packets
are checked by the 16-bit sequence of CRC-CCITT, whereas the IEEE 802.11a packets are
checked by the CRC-32 sequence.
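
The FCS computation is the same systematic encoding used earlier, applied with a CRC generator polynomial. The sketch below uses the CRC-CCITT polynomial x^16 + x^12 + x^5 + 1 and is deliberately bare: practical protocols usually add conventions such as register preset values, bit ordering, and complementing of the FCS, none of which are modeled here. The function names append_fcs and frame_ok and the payload bytes are illustrative assumptions.

```python
CRC_CCITT = 0x11021        # x^16 + x^12 + x^5 + 1
FCS_BITS = 16


def gf2_remainder(value, poly):
    """Remainder of value(x) / poly(x) over GF(2); bit i is the coefficient of x^i."""
    deg = poly.bit_length() - 1
    while value.bit_length() - 1 >= deg:
        value ^= poly << (value.bit_length() - 1 - deg)
    return value


def append_fcs(payload: bytes) -> int:
    """Return the frame (payload followed by its 16-bit FCS) as one integer."""
    shifted = int.from_bytes(payload, "big") << FCS_BITS   # x^16 d(x)
    return shifted | gf2_remainder(shifted, CRC_CCITT)


def frame_ok(frame: int) -> bool:
    """A frame passes the check when it is a multiple of the generator polynomial."""
    return gf2_remainder(frame, CRC_CCITT) == 0


frame = append_fcs(b"payload data")      # arbitrary illustrative payload
corrupted = frame ^ (0b101 << 20)        # flip two nearby bits in transit
```

The receiver only needs the remainder test: an error-free frame leaves a zero remainder, while the corrupted frame in this sketch does not.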


TABLE 14.7
Commonly Used CRC Codes and Corresponding Generator Polynomials

Code         Number of Bits in FCS    Generator Polynomial g(x)
CRC-8        8                        x^8 + x^7 + x^6 + x^4 + x^2 + 1
CRC-12       12                       x^12 + x^11 + x^3 + x^2 + x + 1
CRC-16       16                       x^16 + x^15 + x^2 + 1
CRC-CCITT    16                       x^16 + x^12 + x^5 + 1
CRC-32       32                       x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11
                                      + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1


14.5 THE EFFECTS OF ERROR CORRECTION

Comparison of Coded and Uncoded Systems

It is instructive to compare the bit error probabilities (or bit error rate) when coded and uncoded
schemes are under similar constraints of power and information rate.

Let us consider a t-error correcting (n, k) code. In this case, k information digits are coded
into n digits. For a proper comparison, we shall assume that k information digits are transmitted
in the same time interval over both systems and that the transmitted power S_i is also maintained
the same for both systems. Because only k digits are required to be transmitted in the uncoded
system (versus n over the coded one), the bit rate is lower for the uncoded system than the
coded one by a factor of k/n. This means that the bandwidth ratio of the uncoded system over
the coded system is k/n. Clearly, the coded system sacrifices bandwidth for better reliability.
On the other hand, the coded system sends n code bits for k information bits. To be fair, the
total energy used by the n code bits must equal the total energy used by the uncoded system
for the k information bits. Thus, in the coded system, each code bit has E_b that is k/n times that
of the uncoded system bit. We need to illustrate how error correction can reduce the originally
higher bit error rate (BER) despite this loss of code bit energy.

Let P_bu and P_bc represent the raw data bit error probabilities in the uncoded and coded
cases, respectively. For the uncoded case, the raw bit error rate is the final bit error rate P_eu.

For a t-error correcting (n, k) code, the raw BER can be reduced because the decoder can
correct up to t bit errors in every n bits. We consider the ideal case that the decoder will not
attempt to correct the codeword when there are more than t errors in n bits. This action of the
ideal error correction decoder can reduce the average BER. Let P(j, n) denote the probability
of j errors in n digits. Then the average number of bit errors in each codeword after error
correction is

n_e = E{j bit errors in n bits}
    = \sum_{j=t+1}^{n} j P(j, n)    (14.27a)

Therefore the average BER after error correction should be

P_ec = n_e / n    (14.27b)

Because there are \binom{n}{j} ways in which j errors can occur in n digits (Example 8.6), we have

P(j, n) = \binom{n}{j} (P_bc)^j (1 - P_bc)^{n-j}

Based on Eq. (14.27a),

n_e = \sum_{j=t+1}^{n} j \binom{n}{j} (P_bc)^j (1 - P_bc)^{n-j}    (14.28a)

P_ec = \sum_{j=t+1}^{n} (j/n) \binom{n}{j} (P_bc)^j (1 - P_bc)^{n-j}
     = \sum_{j=t+1}^{n} \binom{n-1}{j-1} (P_bc)^j (1 - P_bc)^{n-j}    (14.28b)

For P_bc << 1, the first term in the summation of Eq. (14.28b) dominates all the other terms,
and we are justified in ignoring all but the first term. Hence,

P_ec ≈ \binom{n-1}{t} (P_bc)^{t+1} (1 - P_bc)^{n-(t+1)}    (14.29a)

     ≈ \binom{n-1}{t} (P_bc)^{t+1}    for P_bc << 1    (14.29b)


For further comparison, we must assume some specific transmission scheme. Let us
consider a coherent PSK scheme. In this case, for an additive white Gaussian noise (AWGN)
channel,

P_bu = Q( \sqrt{2E_b/N} )    (14.30a)

and because E_b for the coded case is k/n times that for the uncoded case,

P_bc = Q( \sqrt{2kE_b/(nN)} )    (14.30b)

Hence,

P_ec = \binom{n-1}{t} [ Q( \sqrt{2kE_b/(nN)} ) ]^{t+1}    (14.31a)

P_eu = Q( \sqrt{2E_b/N} )    (14.31b)

To compare coded and uncoded systems, we could plot P_eu and P_ec as functions of the raw
E_b/N (for the uncoded system). Because Eqs. (14.31) involve parameters t, n, and k, a proper
comparison requires families of plots. For the case of a (7, 4) single-error correcting code
(t = 1, n = 7, and k = 4), P_ec and P_eu in Eqs. (14.31) are plotted in Fig. 14.3 as a function
of E_b/N. Observe that the coded scheme is superior to the uncoded scheme at higher E_b/N,
but the improvement (about 1 dB) is not too significant. For large n and k, however, the coded
scheme can become significantly superior to the uncoded one. For practical channels plagued
by fading and impulse noise, stronger codes can yield substantial gains, as shown in our next
example.

Figure 14.3  Performance comparison of coded (dashed) and uncoded (solid) systems.

It should be noted that the coded system performance of Fig. 14.3 is in fact a slightly
optimistic approximation. The reason is that in analyzing its bit error rate, we assumed that
the decoder will not take any action when the number of errors in each codeword exceeds t.
In practice, the decoder never knows how many errors are in a codeword. Thus, the decoder
will always attempt to correct the codeword, assuming that there are no more than t bit errors.
This means that when there are more than t bit errors, the decoding process may even increase
the number of errors. This counterproductive decoding effect is more likely when P_e is high
at low E_b/N. This effect will be shown later in Sec. 14.13 as a MATLAB exercise.
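
The comparison in Eqs. (14.31) is easy to tabulate. The sketch below assumes coherent PSK on an AWGN channel and uses the first-term approximation of Eq. (14.29b); the function names and the chosen range of E_b/N values are illustrative, and Q(x) is evaluated through the complementary error function.

```python
from math import comb, erfc, sqrt


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 erfc(x / sqrt(2))."""
    return 0.5 * erfc(x / sqrt(2))


def p_uncoded(eb_over_n):
    """Eq. (14.31b): bit error probability of uncoded coherent PSK."""
    return q_function(sqrt(2 * eb_over_n))


def p_coded(eb_over_n, n, k, t):
    """Eq. (14.31a): approximate bit error probability after t-error correction."""
    p_bc = q_function(sqrt(2 * k * eb_over_n / n))   # each code bit has k/n of E_b
    return comb(n - 1, t) * p_bc ** (t + 1)


# (7, 4) single-error correcting code, E_b/N from 4 dB to 10 dB in 2 dB steps.
rows = []
for db in (4, 6, 8, 10):
    ratio = 10 ** (db / 10)
    rows.append((db, p_uncoded(ratio), p_coded(ratio, n=7, k=4, t=1)))

for db, p_eu, p_ec in rows:
    print(f"{db:2d} dB   P_eu = {p_eu:.3e}   P_ec = {p_ec:.3e}")
```

Since the approximation ignores all but the first term of Eq. (14.28b), the coded values are most trustworthy at the higher E_b/N end of the table, where P_bc is small.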


Example 14.6  Compare the performance of an AWGN BSC using a single-error correcting (15, 11) code
with that of the same system using uncoded transmission, given that E_b/N = 9.0946 for the
uncoded scheme and coherent PSK is used to transmit the data.

From Eq. (14.31b),

P_eu = Q( \sqrt{18.1892} ) = 1.0 × 10^{-5}

and from Eq. (14.31a),

P_ec = 14 [ Q( \sqrt{(11/15)(18.1892)} ) ]^2
     = 14 (1.3 × 10^{-4})^2 = 2.03 × 10^{-7}

Note that the word error probability of the coded system is reduced by a factor of 50.
On the other hand, if we wish to achieve the error probability of the coded transmission
(2.03 × 10^{-7}) by means of the uncoded system, we must increase the transmitted power.
If E'_b is the new value of E_b to achieve P_eu = 2.03 × 10^{-7},

P_eu = Q( \sqrt{2E'_b/N} ) = 2.03 × 10^{-7}

This gives E'_b/N = 13.5022. This is an increase over the old value of 9.0946 by a factor
of 1.4846, or 1.716 dB.


Burst Error Detecting and Correcting Codes 

Thus far we have considered detecting or correcting errors that occur independently, or
randomly, in digit positions. On some channels, disturbances can wipe out an entire block of
digits. For instance, a stroke of lightning or a human-made electrical disturbance can affect
several adjacent transmitted digits. On magnetic storage systems, magnetic material defects
usually affect a block of digits. Burst errors are those that wipe out some or all of a sequential
set of digits. In general, random error correcting codes are not efficient for correcting burst
errors. Hence, special burst error correcting codes are used for this purpose.

A burst of length b is defined as a sequence of digits in which the first digit and the bth digit
are definitely in error, with the b - 2 digits in between either in error or correct. For example,
an error vector e = 0010010100 has a burst length of 6.

It can be shown that for detecting all burst errors of length b or less with a linear block code
of length n, b parity check bits are necessary and sufficient. 3 We shall prove the sufficiency
part of this theorem by constructing a code of length n with b parity check digits that will
detect a burst of length b.

To construct such a code, let us group k data digits into segments of b digits in length
(Fig. 14.4). To this we add a last segment of b parity check digits, which are determined as
follows. The modulo-2 sum of the ith digits from each segment (including the parity check
segment) must be zero. For example, the first digits in the five data segments are 1, 0, 1, 1,
and 1. Hence, to obtain a modulo-2 sum zero, we must have 0 as the first parity check digit. We
continue in this way with the second digit, the third digit, and so on, to the bth digit. Because
parity check digits are a linear combination of data digits, this is a linear block code. Moreover,
it is a systematic code.

It is easy to see that if a digit sequence of length b or less is in error, parity will be violated
and the error will be detected (but not corrected), and the receiver can request retransmission
of the digits lost. One of the interesting properties of this code is that b, the number of parity
check digits, is independent of k (or n), which makes it a very useful code for such systems as
packet switching, where the data digits may vary from packet to packet. It can be shown that
a linear code with b parity bits detects not only all bursts of length b or less, but also a high
percentage of longer bursts. 3

Figure 14.4  Burst error detection.

If we are interested in correcting rather than detecting burst errors, we require twice
as many parity check digits. According to the Hamming sphere reasoning, to correct all
burst errors of length b or less, a linear block code must have at least 2b parity check
digits.
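
The burst-detecting construction can be sketched directly from its description. In the sketch below the segment length b = 4 and the 20 data digits are arbitrary illustrative values, chosen only so that the first digits of the five segments are 1, 0, 1, 1, and 1 as in the text; the function names are hypothetical.

```python
def add_burst_parity(data, b):
    """Append b parity digits so that the i-th digits of all segments sum to zero."""
    if len(data) % b != 0:
        raise ValueError("this sketch assumes the data fill whole segments")
    parity = [0] * b
    for i, bit in enumerate(data):
        parity[i % b] ^= bit          # modulo-2 sum over position i of each segment
    return data + parity


def burst_detected(received, b):
    """Return True when any of the b position-wise parity checks fails."""
    checks = [0] * b
    for i, bit in enumerate(received):
        checks[i % b] ^= bit
    return any(checks)


b = 4
# Five illustrative data segments: 1011, 0110, 1100, 1001, 1111.
data = [1, 0, 1, 1,  0, 1, 1, 0,  1, 1, 0, 0,  1, 0, 0, 1,  1, 1, 1, 1]
codeword = add_burst_parity(data, b)

# A burst of length 3 starting at the sixth digit (pattern 101).
received = list(codeword)
received[5] ^= 1
received[7] ^= 1
```

A burst of length b or less touches each of the b digit positions at most once, so at least one position-wise check must fail; this is why burst_detected flags the received word above.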


14.6 CONVOLUTIONAL CODES 


Convolutional (or recurrent) codes, introduced in 1955, 4 differ from block codes as follows. 
In a block code, the block of n code digits generated by the encoder in any particular time unit 
depends only on the block of k input data digits within that time unit. In a convolutional code, 
on the other hand, the block of n code digits generated by the encoder in a particular time unit 
depends not only on the block of k message digits within that time unit but also on the data 
digits within a previous span of N - 1 time units (N > 1). For convolutional codes, k and 
n are usually small. Convolutional codes can be devised for correcting random errors, burst 
errors, or both. Encoding is easily implemented by shift registers. As a class, convolutional 
codes are easier to encode. 

14.6.1 Convolutional Encoder 

A convolutional encoder with constraint length N consists of an N-stage shift register and ℓ
modulo-2 adders. Figure 14.5 shows such an encoder for the case of N = 3 and ℓ = 2. The
message digits are applied at the input of the shift register. The coded digit stream is obtained
at the commutator output. The commutator samples the ℓ modulo-2 adders in sequence, once
during each input-bit interval. We shall explain this operation with reference to the input
digits 11010.

Figure 14.5  Convolutional encoder.

Initially, all the contents of the register are 0. At time k = 1, the first data digit 1 enters
the register. The content d_k shows 1 and all the other contents d_{k-1} = 0 and d_{k-2} = 0 are
still unchanged. The two modulo-2 adders show encoder output v_{k,1} = 1 and v_{k,2} = 1 for this
data input. The commutator samples this output. Hence, the encoder output is 11. At k = 2,
the second message bit 1 enters the register. It enters the register stage d_k, and the previous 1
in d_k is now shifted to d_{k-1}, whereas d_{k-2} is still 0. The modulo-2 adders now show v_{k,1} = 0
and v_{k,2} = 1. Hence, the encoder output is 01. In the same way, when the new digit 0 enters
the register, we have d_k = 0, d_{k-1} = 1, and d_{k-2} = 1, and the encoder output is 01.

Observe that each data digit influences N groups of ℓ digits in the output (in this case
three groups of two digits). The process continues until the last data digit enters the stage
d_k.* We cannot stop here, however. We add N - 1 number of 0s to the input stream (dummy
or augmented data) to make sure that the last data digit (0 in this case) proceeds all the way
through the shift register, to influence the N groups of ℓ digits. Hence, when the input digits are
11010, we actually apply (from left to right) 1101000, which contains N - 1 augmented zeros
to the input of the shift register. It can be seen that when the last digit of the augmented message
stream enters d_k, the last digit of the message stream has passed through all the N stages of
the register. The reader can verify that the encoder output is given by 11010100101100. Thus,
there are in all n = (N + k - 1)ℓ digits in the coded output for every k data digits. In practice,
k >> N, and, hence, there are approximately kℓ coded output digits for every k data digits,
giving a rate η ≈ 1/ℓ.†

It can be seen that unlike the block encoder, the convolutional encoder operates on a
continuous basis, and each data digit influences N groups of ℓ digits in the output.
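
The encoder walk-through can be reproduced in code. Because Fig. 14.5 is not reproduced here, the adder connections in the sketch below are an assumption: v_{k,1} = d_k + d_{k-1} + d_{k-2} and v_{k,2} = d_k + d_{k-2} (modulo 2) were chosen because they reproduce every output quoted in the text for N = 3 and ℓ = 2. The function name conv_encode is illustrative.

```python
def conv_encode(data_bits):
    """Encode with an N = 3, two-adder convolutional encoder.

    Assumed connections (inferred from the outputs quoted in the text):
        v1 = d_k ^ d_(k-1) ^ d_(k-2)
        v2 = d_k ^ d_(k-2)
    """
    n_stages = 3
    d1 = d2 = 0                                   # d_(k-1) and d_(k-2), initially 0
    out = []
    # Append N - 1 zeros so the last data digit passes through every stage.
    for d in list(data_bits) + [0] * (n_stages - 1):
        out.append(d ^ d1 ^ d2)                   # first adder, sampled first
        out.append(d ^ d2)                        # second adder
        d1, d2 = d, d1                            # shift the register
    return out


encoded = conv_encode([1, 1, 0, 1, 0])
encoded_text = "".join(str(bit) for bit in encoded)
```

For k = 5 data digits the output has (N + k - 1)ℓ = 14 digits, in agreement with the count given above.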


Code Tree 

The process of coding and decoding is considerably facilitated by what is known as the code
tree, which shows the coded output for any possible sequence of data digits. The code tree for
the encoder in Fig. 14.5 with k = 5 is shown in Fig. 14.6. When the first digit is 0, the encoder
output is 00, and when it is 1, the output is 11. This is shown by the two tree branches that
start at the initial node. The upper branch represents 0, and the lower branch represents 1. This
convention will be followed throughout. At the terminal node of each of the two branches,
we follow a similar procedure, corresponding to the second data digit. Hence, two branches
initiate from each node, the upper one for 0 and the lower one for 1. This continues until the
kth data digit. From there on, all the input digits are 0 (augmented digit), and we have only one
branch until the end. Hence, in all there are 32 (or 2^k) outputs corresponding to 2^k possible data
vectors. The coded output for input 11010 can be easily read from this tree (the path shown
dashed in Fig. 14.6).

Figure 14.6 shows that the code tree becomes repetitive after the third branch. This can
be seen by noting that the two blocks enclosed in the dashed lines are identical. It means that
the output from the fourth input digit is the same whether the first digit was 1 or 0. This is
not surprising, since when the fourth input digit enters the shift register, the first input digit
is shifted out of the register and ceases to influence the output digits. In other words, the data
vector 1x_1x_2x_3x_4... and the data vector 0x_1x_2x_3x_4... generate the same output after the third
group of output digits. It is convenient to label the four third-level nodes (the nodes appearing
at the beginning of the third branch) as nodes a, b, c, and d (Fig. 14.6). The repetitive structure
begins at the fourth-level nodes and continues at the fifth-level nodes, whose behavior is similar
to that of nodes a, b, c, and d at the third level. Hence, we label the fourth- and fifth-level
nodes also as either a, b, c, or d. What this means is that at the fifth-level nodes, the first two
data digits have become irrelevant; that is, any of the four combinations (11, 10, 01, or 00) for
the first two data digits will give the same output after the fifth node.


* For a systematic code, one of the output digits must be the data digit itself. 

† In general, instead of shifting one digit at a time, b digits may be shifted at a time. In this case $\eta \simeq b/\ell$.

Figure 14.6 Code tree for the encoder in Fig. 14.5.

State Transition Diagram Representation

The encoder behavior can be seen from the perspective of a finite state machine with its
state transition diagram. When a data bit enters the shift register (in $d_k$), the output bits are
determined not only by the data bit in $d_k$, but by the two previous data bits already in stages
$d_{k-2}$ and $d_{k-1}$. There are four possible combinations of the two previous bits (in $d_{k-2}$ and
$d_{k-1}$): 00, 01, 10, and 11. We shall label these four states a, b, c, and d, respectively, as shown
in Fig. 14.7a. Thus, when the previous two bits are 01 ($d_{k-2} = 0$, $d_{k-1} = 1$), the state is b,
and so on. The number of states is equal to $2^{N-1}$.

A data bit 0 or 1 generates four different outputs, depending on the encoder state. If the
data bit is 0, the encoder output is 00, 10, 11, or 01, depending on whether the encoder state is
a, b, c, or d. Similarly, if the data bit is 1, the encoder output is 11, 01, 00, or 10, depending on
whether the encoder state is a, b, c, or d. This entire behavior can be concisely expressed by
the state transition diagram (Fig. 14.7b), a four-state directed graph that uniquely represents
the input-output relation of this encoder. We label each transition path with a label of input bit
over output bits:

$$d_k / \{v_{k,1}\ v_{k,2}\}$$

This way, we know exactly the input information bit $d_k$ for each state transition and its
corresponding encoder output bits $\{v_{k,1}\ v_{k,2}\}$.

For instance, when the encoder is in state a, and we input 1, the encoder output is 11. Thus
the transition path is labeled with 1/11. The encoder now goes to state b for the next data bit
because at this point the previous two bits become $d_{k-2} = 0$ and $d_{k-1} = 1$. Similarly, when
the encoder is in state a and the input is 0, the output is 00 (solid line), and the encoder remains
in state a. Note that the encoder cannot go directly from state a to states c or d. From any given
state, the encoder can go to only two states directly by inputting a single data bit. This is an
extremely important observation, which will be used later. The encoder goes from state a to
state b (when the input is 1), or to state a (when the input is 0), and so on. The encoder cannot
go from a to c in one step. It must go from a to b to c, or from a to b to d to c, and so on. We
can also verify these facts from the code tree. Figure 14.7b contains the complete information
of the code tree.

Figure 14.7 (a) State and (b) state transition diagram of the encoder in Fig. 14.5. State labels ($d_{k-2}\,d_{k-1}$): a: 00, b: 01, c: 10, d: 11.

Figure 14.8 Trellis diagram for the encoder in Fig. 14.5. Branch labels: input bits / output bits.
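The state transition behavior described above can be sketched as a small table-driven finite state machine. The output table is copied from the text; the function names `step` and `encode` and the `tail` parameter (the number of augmented 0s) are illustrative choices, not part of the original.

```python
# States are the two previous data bits (d_{k-2}, d_{k-1}), as labeled in the text.
STATES = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (1, 1)}
NAME = {bits: name for name, bits in STATES.items()}

# OUTPUT[input_bit][state] -> the two encoder output bits listed in the text.
OUTPUT = {
    0: {"a": "00", "b": "10", "c": "11", "d": "01"},
    1: {"a": "11", "b": "01", "c": "00", "d": "10"},
}


def step(state, bit):
    """Return (next_state, output_bits) for one input data bit."""
    _, d_prev = STATES[state]
    # The newest previous bit becomes the older one; the input becomes the newest.
    return NAME[(d_prev, bit)], OUTPUT[bit][state]


def encode(data, tail=2):
    """Encode a list of data bits, starting in state a, then append `tail` zeros."""
    state, groups = "a", []
    for bit in list(data) + [0] * tail:
        state, out = step(state, bit)
        groups.append(out)
    return groups, state


groups, final_state = encode([1, 1, 0, 1, 0])
print(" ".join(groups), final_state)  # prints: 11 01 01 00 10 11 00 a
```

From any state, `step` can reach only two next states, which is the observation the text singles out as important for decoding.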


Trellis Diagram 

Another useful way of representing the code tree is the trellis diagram (Fig. 14.8). The diagram
starts from scratch (all 0s in the shift register, i.e., state a) and makes transitions corresponding
to each input data digit. These transition branches are labeled just as we labeled the state
transition diagram. Thus, when the first input digit is 0, the encoder output is 00, and the trellis
branch is labeled 0/00. This is readily seen from Fig. 14.7b. We continue this way with the
second input digit. After the first two input digits, the encoder is in one of the four states a, b, c, or
d, as shown in Fig. 14.8. If the encoder is in state a (previous two data digits 00), it goes to state
b if the next input bit is 1 or remains in state a if the next input bit is 0. In so doing, the encoder
output is 11 (a to b) or 00 (a to a). Note that the structure of the trellis diagram is completely
repetitive, as expected, and can be readily drawn by using the state diagram in Fig. 14.7b.

Figure 14.9 A recursive systematic convolutional (RSC) encoder.

It should be noted that the convolutional encoder can have feedback branches. In fact, 
feedback in the convolutional encoder generates the so-called recursive code. As shown in 
Fig. 14.9, the data bit can have a direct path to the output bit. The bits from the top branch will 
be the information bits from the input directly. This code is therefore systematic. This encoder 
leads to a recursive systematic convolutional (RSC) code. It can be shown (see Prob. 14.5-3) 
that the RSC encoder can also be represented by a similar state transition diagram and a trellis 
diagram. Consequently, recursive convolutional codes can be decoded by using the methods
described next for nonrecursive convolutional codes. 


14.6.2 Decoding Convolutional Codes 

We shall consider two important techniques: (1) maximum likelihood decoding (Viterbi algorithm)
and (2) sequential decoding. Although both are known as hard-decision decoders, the
Viterbi algorithm (VA) is much more flexible and can be easily adapted to allow soft input and
to generate soft output decisions, to be described later in this chapter.

Maximum Likelihood Decoding: The Viterbi Algorithm 

Among various decoding methods for convolutional codes, Viterbi’s maximum likelihood algorithm 3
is one of the best techniques for digital communications when computational complexity
dominates in importance. It permits major equipment simplification while obtaining the full
performance benefits of maximum likelihood decoding. The decoder structure is relatively simple
for short constraint length N, making decoding feasible at relatively high rates of up to 10 Gbit/s.

In AWGN channels, the maximum likelihood receiver requires selecting a codeword closest
to the received word. For a long sequence of received data representing k message bits
and $2^k$ codewords, direct implementation of maximum likelihood decision (MLD) involves
storage of $2^k$ words and their comparison to the received sequence. This computational need
places a severe burden on MLD receivers for large values of k for convolutionally encoded
data frames, typically in the order of hundreds or thousands of bits!

Viterbi made a major simplification for MLD. We shall use the convolutional code example 
of Figs. 14.5 and 14.7 to illustrate the fundamental operations of the VA. First, we stress that 
each path that traverses through the trellis represents a valid codeword. The objective of MLD 
is to find the best path through the trellis that is closest to the received data bit sequence. To 
understand this, consider again the trellis diagram in Fig. 14.8. Our problem is as follows: 
given a received sequence of bits, we need to find a path in the trellis diagram with the output 
digit sequence that agrees best with the received sequence. The minimum (Hamming) distance 
path represents the most likely sequence up to stage i.





As shown in Fig. 14.8, each codeword is a trellis path that should start from state a (00).
Because every path at stage i must grow out of the paths at stage i − 1, the optimum path to
each state at stage i must contain one of the best paths to each of the four states at stage i − 1.
In short, the optimum path to each state at stage i is a descendant of the predecessors at stage
i − 1. All optimum paths at any stage $i + i_0$ are descendants of the optimum paths at stage i.
Hence, only the best path to each state need be stored at a given stage. There is no reason to
store anything but the optimum path to each state at every stage because nonoptimum paths
would only increase the metric of path distance to the received data sequence.

In the special example of Fig. 14.7, its trellis diagram (Fig. 14.8) shows that each of the
four states (a, b, c, and d) has only two predecessors; that is, each state can be reached only
through two previous states. More importantly, since only the four best surviving paths (one
for each state) exist at stage i − 1, there are only two possible paths for each state at stage i.
Hence, by comparing the total Hamming distances (from the received sequence) of the two 
paths, we can find the optimum path with the minimum Hamming distance for every state at 
stage i that corresponds to a codeword which is closest to the received sequence up to stage i. 
The optimum path to each state is known as the survivor or the surviving path. 


Example 14.7 We now study a decoding example of the Viterbi algorithm for maximum likelihood decoding
of the convolutional code generated by the encoder of Fig. 14.5. Let the first 12 received digits
be 01 10 11 00 00 00, as shown in Fig. 14.10a-e. Showing the received digits along with the
branch output bits makes it easier to compute the branch Hamming distance in each stage.

• We start from the initial state of a (00). Every stage of the decoding process is to find the
optimum path to the four states given the 2 received bits during the stage. There are two
possible states leading to each state in any given stage. The survivor with the minimum
Hamming distance is retained (solid line), whereas the other path with larger distance is
discarded (dashed line). The Hamming distance of each surviving path is labeled at the
end of a stage to each of the four states.

• After two stages, there is exactly one optimum (surviving) path to each state (Fig. 14.10a).
The Hamming distances of the surviving paths are labeled as 2, 2, 1, and 3, respectively.

• Each state at stage 3 has two possible paths (Fig. 14.10b). We keep the optimum path
with the minimum distance (solid line). The distances of the two possible paths (from
top to bottom) arriving at each state are given in the minimization label. For example,
for state a, the first path (dashed line from a) has Hamming distance of 2 + 2 = 4,
whereas the second path (solid line from c) has the distance of 1 + 0 = 1.

• Repeat the same step for stages 4, 5, and 6, as illustrated in Fig. 14.10c-e.

• The final optimum path after stage 6 is identified as the shaded solid path with minimum
distance of 2 ending in state a, as shown in Fig. 14.10e. Thus, the MLD output should
be

Codeword: 111011000000 (14.32a)

Information bits: 1 0 0 0 0 0 (14.32b)


Note that there are only four contending paths (the four survivors at states a, b, c, and d)
until the end of stage 6. All four paths merged up till stage 3. This means that the first three
branch selections are the most reliable. In fact, continuing the VA when given additional
received bits will not change the first three branches and their associated decoder outputs.

Figure 14.10 Viterbi decoding example in Fig. 14.5: (a) stages 1 and 2; (b) stage 3; (c) stage 4; (d) stage 5; (e) stage 6.


In the preceding example, we have illustrated how to progress from one stage to the next
by determining the optimum path (survivor) leading to each of the states. When these survivors
do merge, the merged branches represent the most reliable MLD outputs. For the later stages
that do not exhibit a merged path, we are ready to make a maximum likelihood decision based
on the received data bits up to that stage. This process, known as truncation, is designed to
force a decision on one path among all the survivors without leading to a long decoding delay.
One way to make a truncated decision is to take the minimum distance path as in Eq. (14.32).
Another alternative is to rely on extra codeword information. In Fig. 14.10e, if the encoder
always forces the last two data digits to be 00, then we can consider only the survivor ending at
state a.
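The stage-by-stage survivor selection of Example 14.7 can be sketched as follows. This is a minimal hard-decision illustration, assuming the four-state trellis given by the state transition table of this section; the names `BRANCH`, `hamming`, and `viterbi` are hypothetical, and ties between two paths are broken arbitrarily in favor of the path examined first. The final decision is a truncated one: it simply takes the survivor with the smallest distance.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


# BRANCH[state][input_bit] = (next_state, output_bits), from the state table.
BRANCH = {
    "a": {0: ("a", "00"), 1: ("b", "11")},
    "b": {0: ("c", "10"), 1: ("d", "01")},
    "c": {0: ("a", "11"), 1: ("b", "00")},
    "d": {0: ("c", "01"), 1: ("d", "10")},
}


def viterbi(received, start="a"):
    """Keep one survivor per state at every stage.

    `received` is a list of 2-bit strings. Returns the per-stage survivor
    distances and the final survivors as
    state -> (distance, information bits, code groups).
    """
    survivors = {start: (0, [], [])}
    history = []
    for group in received:
        nxt = {}
        for state, (dist, info, code) in survivors.items():
            for bit, (to, out) in BRANCH[state].items():
                cand = (dist + hamming(out, group), info + [bit], code + [out])
                # Retain only the smaller-distance path into each state.
                if to not in nxt or cand[0] < nxt[to][0]:
                    nxt[to] = cand
        survivors = nxt
        history.append({s: m[0] for s, m in survivors.items()})
    return history, survivors


received = "01 10 11 00 00 00".split()
history, survivors = viterbi(received)
best_state = min(survivors, key=lambda s: survivors[s][0])
_, info_bits, code_groups = survivors[best_state]
print(history[1])                       # prints: {'a': 2, 'b': 2, 'c': 1, 'd': 3}
print(info_bits, "".join(code_groups))  # prints: [1, 0, 0, 0, 0, 0] 111011000000
```

Storing the whole path with every survivor keeps the sketch short; a practical decoder would store only one predecessor pointer per state and stage and trace back at the end.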

With the Viterbi algorithm, storage and computational complexity are proportional to $2^{N-1}$
and are very attractive for constraint length N < 10. To achieve very low error probabilities,
longer constraint lengths are required, and sequential decoding (to be discussed next) may
become attractive.

Sequential Decoding 

In sequential decoding, a technique proposed by Wozencraft, the complexity of the decoder
increases linearly rather than exponentially. To explain this technique, let us consider an encoder
with N = 4 and $\ell = 3$ (Fig. 14.11). The code tree for this encoder is shown in Fig. 14.12.
Each data digit generates three ($\ell = 3$) output digits but affects four groups of three digits
(12 digits) in all.

In this decoding scheme, we observe only three (or $\ell$) digits at a time to make a tentative
decision, with readiness to change our decision if it creates difficulties later. A sequential 
detector acts much like a driver who occasionally makes a wrong choice at a fork in the road, 
but quickly discovers the error (because of road signs), goes back, and takes the other path. 

Applying this insight to our decoding problem, the analogous procedure would be as
follows. We look at the first three received digits. There are only two paths of three digits from
the initial node $n_1$. We choose that path whose sequence is at the shortest Hamming distance
from the first three received digits. We thus progress to the most likely node. From this node
there are two paths of three digits. We look at the second group of the three received digits
and choose that path whose sequence is closest to these received digits. We progress this way
until the fourth node. If we were unlucky enough to have a large number of errors in a certain
received group of $\ell$ digits, we will take a wrong turn, and from there on we will find it more
difficult to match the received digits with those along the paths available from the wrong node.
This is the clue to the realization that an error has been made. Let us explain this by an example.

Suppose a data sequence 11010 is encoded by the encoder in Fig. 14.11. Because N = 4,
we add three dummy 0s to this sequence so that the augmented data sequence is 11010000.
The coded sequence will be (see the code tree in Fig. 14.12)

111 101 001 111 001 011 011 000 
Let the received sequence be 

101 011 001 111 001 011 011 000 

There are three bit errors: one in the first group and two in the second group. We start at the
initial node $n_1$. The first received group 101 (one error) being closer to 111, we make a correct
decision to go to node $n_2$. But the second group 011 (two errors) is closer to 010 than to 101
and will lead us to the wrong node $n_3'$ rather than to $n_3$. From here on we are on the wrong track,
and, hence, the received digits will not match any path starting from $n_3'$. The third received
group is 001 and does not match any sequence starting at $n_3'$ (viz., 011 and 100). But it is closer
to 011. Hence, we go to node $n_4'$. Here again the fourth received group 111 does not match
any group starting at $n_4'$ (viz., 011 and 100). But it is closer to 011. This takes us to node $n_5'$.
It can be seen that the Hamming distance between the sequence of 12 digits along the path
$n_1 n_2 n_3' n_4' n_5'$ and the first 12 received digits is 4, indicating four errors in 12 digits (if our path
is correct). Such a high number of errors should immediately make us suspicious. If $P_e$ is the
digit error probability, then the expected number of errors $n_e$ in d digits is $P_e d$. Because $P_e$ is
on the order of $10^{-4}$ to $10^{-6}$, four errors in 12 digits is unreasonable. Hence, we go back to
node $n_3'$ and try the lower branch, leading to $n_5''$. This path is even worse than the
previous one: it gives five errors in 12 digits. Hence, we go back even farther to node $n_2$ and
try the path leading to $n_3$ and farther. We find the path $n_1 n_2 n_3 n_4 n_5$ giving three errors. If we
go back still farther to $n_1$ and try alternate paths, we find that none yields less than five errors.
Thus, the correct path is taken as $n_1 n_2 n_3 n_4 n_5$, giving three errors. This enables us to decode
the first transmitted digit as 1. Next, we start at node $n_2$, discard the first three received digits,
and repeat the procedure to decode the second transmitted digit. We repeat this until all the
digits have been decoded.
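The error counts quoted in this walk-through can be checked numerically with a short sketch. The tap vectors in `TAPS` are an assumption: they were inferred so that encoding 11010000 reproduces the coded sequence given above, and they are not read from Fig. 14.11. The helper names `encode` and `path_errors` are likewise illustrative.

```python
from itertools import product

# ASSUMED tap vectors g_0..g_3 (three output digits each), inferred from the
# coded sequence quoted in the text rather than taken from Fig. 14.11.
TAPS = [0b111, 0b010, 0b011, 0b011]


def encode(data):
    """Each output group is the XOR of the taps selected by the current
    data digit and the three digits before it."""
    groups = []
    for t in range(len(data)):
        g = 0
        for j, tap in enumerate(TAPS):
            if t - j >= 0 and data[t - j]:
                g ^= tap
        groups.append(format(g, "03b"))
    return groups


def path_errors(prefix, received):
    """Hamming distance between the coded path for `prefix` and the
    corresponding leading received groups."""
    coded = encode(prefix)
    return sum(a != b for c, r in zip(coded, received) for a, b in zip(c, r))


received = "101 011 001 111 001 011 011 000".split()
print(" ".join(encode([1, 1, 0, 1, 0, 0, 0, 0])))
print(path_errors([1, 0, 0, 0], received))  # first tentative path: 4 errors
print(path_errors([1, 0, 1, 1], received))  # lower branch after the wrong turn: 5 errors
print(path_errors([1, 1, 0, 1], received))  # path through the correct nodes: 3 errors
print(min(path_errors([0, *rest], received)
          for rest in product((0, 1), repeat=3)))  # best path starting with 0: 5 errors
```

Under the assumed taps, the four counts agree with the values 4, 5, 3, and "none less than five" stated in the text.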

The next important question concerns the criterion for deciding when the wrong path is
chosen. The plot of the expected number of errors $n_e$ as a function of the number of decoded
digits d is a straight line ($n_e = P_e d$) with slope $P_e$, as shown in Fig. 14.13. The actual number
of errors along the path is also plotted. If the errors remain within a limit (the discard level),
the decoding continues. If at some point the errors exceed the discard level, we go back to the
nearest decision node and try an alternate path. If errors still increase beyond the discard level,
we then go back one more node along the path and try an alternate path. The process continues
until the errors are within the set limit. By making the discard level very stringent (close to the
expected error curve), we reduce the average number of computations. On the other hand, if the
discard level is made too stringent, the decoder will discard all possible paths in some extremely
rare cases of an unusually large number of errors due to noise. This difficulty is usually resolved
by starting with a stringent discard level. If on rare occasions the decoder rejects all paths, the
discard level can be relaxed little by little until one of the paths is acceptable.

Figure 14.13 Setting the threshold in sequential decoding.

It can be shown that the error probability in this scheme decreases exponentially with N,
whereas the system complexity grows only linearly with k. The code rate is $\eta \simeq 1/\ell$. It can be
shown that for $\eta < \eta_0$ the average number of incorrect branches searched per decoded digit
is bounded, whereas for $\eta > \eta_0$ it is not; hence $\eta_0$ is called the computational cutoff rate.

There are several disadvantages to sequential decoding: 

1. The number of incorrect path branches, and consequently the computation complexity, is a 
random variable depending on the channel noise. 

2. To make storage requirements easier, the decoding speed has to be maintained at 10 to 20 
times faster than the incoming data rate. This limits the maximum data rate capability. 

3. The average number of branches can occasionally become very large and may result in a 
storage overflow, causing relatively long sequences to be erased. 

A third technique for decoding convolutional codes is feedback decoding, with threshold
decoding 6 as a subclass. Threshold decoders are easily implemented. Their performance,
however, does not compare favorably with the previous two methods.


14.7 TRELLIS DIAGRAM OF BLOCK CODES 

Whereas a trellis diagram is connected with convolutional code in a direct and simple way, a 
syndrome trellis can also be constructed for a binary linear (n, k) block code according to its
parity check matrix 7 H. The construction can be stated as follows:


• Let $(c_1, c_2, \ldots, c_n)$ be a codeword of the block code.

• Let $H = [h_1\ h_2\ \cdots\ h_n]$ be the $(n - k) \times n$ parity check matrix with columns $\{h_i\}$.

• There are $\min(2^k, 2^{n-k})$ possible states in the trellis.

• The state of a codeword at instant i is determined by the codeword and the parity check
matrix by the syndrome from the first codeword bit to the ith codeword bit:

$$z_i = c_1 h_1 \oplus c_2 h_2 \oplus \cdots \oplus c_i h_i \tag{14.33}$$

Note that this syndrome trellis, unlike the state transition trellis of convolutional code, is
typically nonrepeating. In fact, it always starts from the “zero” state and ends in the “zero” state.
Indeed, this trellis is a time-varying trellis. We use an example to illustrate the construction
of a syndrome trellis.


Example 14.8 Consider a Hamming (7, 4, 3) code with parity check matrix

$$H = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix} \tag{14.34}$$

Sketch the trellis diagram for this block code.

Figure 14.14 Trellis diagram of a Hamming (7, 4, 3) code with parity check matrix of Eq. (14.34).


For this code, there are 3 error syndrome bits defining a total of $2^3 = 8$ states. Denote the
eight states as $(S_0, S_1, S_2, S_3, S_4, S_5, S_6, S_7)$. There are $2^k = 2^4 = 16$ total codewords
with 7 code bits that are in the null-space of the parity check matrix H. By enumerating
all 16 codewords, we can follow Eq. (14.33) to determine all the paths through the trellis.

The corresponding time-varying trellis diagram is shown in Fig. 14.14. Notice that
each path corresponds to a codeword. We always start from state $S_0$ initially and end at
the state $S_0$. Unlike the case of convolutional code, it is not necessary to label the trellis
branches in this case. Whenever there is a state transition between different states, the
branch automatically corresponds to a “1” code bit. When a state stays the same, then the
transition branch corresponds to a “0” code bit.
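The enumeration described in Example 14.8 can be sketched in a few lines. The matrix `H` is the one in Eq. (14.34); the function name `states` and the representation of a state as a 3-tuple of syndrome bits are illustrative choices.

```python
from itertools import product

# Parity check matrix of Eq. (14.34).
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]
n = len(H[0])
ZERO = (0,) * len(H)
columns = [tuple(row[i] for row in H) for i in range(n)]  # h_1, ..., h_n


def states(word):
    """Partial syndromes z_0, z_1, ..., z_n along a word, following Eq. (14.33)."""
    z = ZERO
    path = [z]
    for bit, h in zip(word, columns):
        if bit:
            z = tuple(a ^ b for a, b in zip(z, h))
        path.append(z)
    return path


# Codewords are exactly the words whose full syndrome z_n is zero.
codewords = [w for w in product((0, 1), repeat=n) if states(w)[-1] == ZERO]
trellis_paths = [states(w) for w in codewords]

print(len(codewords))  # prints: 16
print(trellis_paths[1])
```

Because every column of H is nonzero, a code bit of 1 always changes the state and a code bit of 0 never does, which is why the branches need no labels.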


Once we have a trellis diagram, the Viterbi decoding algorithm can be implemented for the
MLD of the block code at reduced complexity. Maximum likelihood detection of block codes
can perform better than a syndrome-based decoder. Keep in mind that the example we show
is a very short code that does not benefit from Viterbi decoding. Clearly, the Viterbi algorithm
makes more sense when one is decoding a long code.

14.8 CODE COMBINING AND INTERLEAVING

Simple and short codes can be combined in various ways to generate longer or more powerful 
codes. Certainly there are many possible ways of combining multiple codes. In this section, 
we briefly describe several of the most common methods of code construction through code 
combining. 


Interleaving Codes for Correcting Burst and Random Errors 

One of the simplest and yet most effective tools for code combining is interleaving, the process
of reordering or shuffling (multiple) codewords generated by the encoder. Thus, a burst of bit
errors will be affecting multiple codewords instead of one. The purpose of interleaving is to 
disperse a large burst of errors over multiple codewords such that each codeword needs to 
correct only a fraction of the error burst. This is because, in general, random error correcting 
codes are designed to tackle sporadic errors in each codeword. Unfortunately, in most practical 
systems, we have errors of both kinds. Among methods proposed to simultaneously correct 
random and burst errors, interleaving is simple and effective. 

For an (n, k) code, if we interleave $\lambda$ codewords, we have what is known as a $(\lambda n, \lambda k)$
interleaved code. Instead of transmitting codewords one by one, we group $\lambda$ codewords and
interleave them. Consider, for example, the case of $\lambda = 3$ and a two-error correcting (15, 8)
code. Each codeword has 15 digits. We group codewords to be transmitted in groups of
three. Suppose the first three codewords to be transmitted are $x = (x_1, x_2, \ldots, x_{15})$,
$y = (y_1, y_2, \ldots, y_{15})$, and $z = (z_1, z_2, \ldots, z_{15})$, respectively. Then instead of transmitting xyz
in sequence as $x_1, x_2, \ldots, x_{15}, y_1, y_2, \ldots, y_{15}, z_1, z_2, \ldots, z_{15}$, we transmit
$x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3, \ldots, x_{15}, y_{15}, z_{15}$. This can be explained graphically by Fig. 14.15, where $\lambda$
codewords (three in this case) are arranged in rows. In normal transmission, we transmit one
row after another. In the interleaved case, we transmit columns (of $\lambda$ elements) in sequence.
When all the 15 (n) columns are transmitted, we repeat the procedure for the next $\lambda$ codewords
to be transmitted.

To explain the error correcting capabilities of this interleaved code, we observe that the
decoder will first remove the interleaving and regroup the received digits as $x_1, x_2, \ldots, x_{15}$,
$y_1, y_2, \ldots, y_{15}$, $z_1, z_2, \ldots, z_{15}$. Suppose the digits in the shaded boxes in Fig. 14.15 were in
error. Because the code is a two-error correcting code, up to two errors in each row will be
corrected. Hence, all the errors in Fig. 14.15 are correctable. We see that there are two random,
or independent, errors and one burst of length 4 in all the 45 digits transmitted. In general, if
the original (n, k) code is t-error correcting, the interleaved code can correct any combination
of t bursts of length $\lambda$ or less.
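The row-in, column-out ordering described above can be sketched with two short helper functions. The names `interleave` and `deinterleave` and the string labels standing in for code digits are illustrative; `depth` denotes the number of codewords interleaved together (three in the example).

```python
def interleave(codewords):
    """Transmit column by column: x1, y1, z1, x2, y2, z2, ..."""
    n = len(codewords[0])
    return [cw[i] for i in range(n) for cw in codewords]


def deinterleave(stream, depth):
    """Regroup the received digits into `depth` rows, one per codeword."""
    return [stream[r::depth] for r in range(depth)]


# Three codewords of 15 digits each, with labels in place of actual bits.
x = [f"x{i}" for i in range(1, 16)]
y = [f"y{i}" for i in range(1, 16)]
z = [f"z{i}" for i in range(1, 16)]

stream = interleave([x, y, z])
print(stream[:6])  # prints: ['x1', 'y1', 'z1', 'x2', 'y2', 'z2']

# A burst of 4 consecutive channel errors, marked with '*'.
corrupted = list(stream)
for pos in range(10, 14):
    corrupted[pos] += "*"
rows = deinterleave(corrupted, 3)
print([sum(d.endswith("*") for d in row) for row in rows])  # prints: [1, 2, 1]
```

After deinterleaving, the burst of four errors is spread so that no row receives more than two of them, which a two-error correcting code can handle.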

Because the interleaver described in Fig. 14.15 takes a block of bits and generates the output
sequence in a fixed, orderly way, interleavers of this kind are known as block interleavers.
The total memory length of the interleaver is known as the interleaving depth. Interleavers
with larger depths can better handle longer bursts of errors, at the cost of larger memory
and longer encoding and decoding delays. A more general interleaver can pseudorandomly
reorder the data bits inside the interleaver and output the bits in an order known to both the
transmitter and the receiver. Such an interleaver is known as a random interleaver. Random
interleavers are generally more effective in combating both random and burst errors.
Because they do not generate outputs following a fixed order, there is a much smaller probability
of receiving a burst of error bits in a codeword because of certain random error
patterns.

Figure 14.15 A block (nonrandom) interleaver for correcting random and burst errors.

Figure 14.16

Product Code 

Interleaved code can be generalized by further encoding the interleaved codewords. The resulting
code can be viewed as a large codeword that must satisfy two parity checks (or constraints).
Figure 14.16 illustrates how to form a product code from two systematic block codes that are
known as component codes. The first is an $(n_1, k_1)$ code and the second is an $(n_2, k_2)$ code.
More specifically, a rectangular block of $k_1 \times k_2$ message bits is encoded by two encoders. First,
$k_2$ blocks of $k_1$ message bits are encoded by the first encoder into $k_2$ codewords of the $(n_1, k_1)$
code. Then an $n_1 \times k_2$ block interleaver sends $n_1$ blocks of $k_2$ bits into the second encoder.
The second $(n_2, k_2)$ encoder adds $n_2 - k_2$ parity bits for each of the $n_1$ blocks, generating $n_1$
codewords of the $(n_2, k_2)$ code for the channel to transmit.

The use of a product code is a simple way to combine two block codes into a single more 
powerful code. In a product code, every code bit is constrained by two sets of parities, one 
from each of the two codes. 
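As a minimal sketch of the two-step encoding, the following example uses single-parity-check codes as both component codes. That choice is an assumption made only to keep the example short; the text does not prescribe particular component codes, and the names `spc_encode` and `product_encode` are hypothetical.

```python
def spc_encode(bits):
    """ASSUMED component code: a (k+1, k) single-parity-check code that
    appends one even-parity digit to the message bits."""
    return list(bits) + [sum(bits) % 2]


def product_encode(block):
    """`block` holds k2 rows of k1 message bits; returns n2 rows of n1 code bits."""
    # First encoder: each of the k2 rows becomes a codeword of the (n1, k1) code.
    rows = [spc_encode(r) for r in block]
    # Block interleaver: read the n1 columns, each holding k2 bits.
    # Second encoder: each column becomes a codeword of the (n2, k2) code.
    cols = [spc_encode(c) for c in zip(*rows)]
    # Return the array in row order again.
    return [list(r) for r in zip(*cols)]


code_array = product_encode([[1, 0, 1],
                             [0, 1, 1]])
for row in code_array:
    print(row)
# prints:
# [1, 0, 1, 0]
# [0, 1, 1, 0]
# [1, 1, 0, 0]
```

Every row and every column of the resulting array has even parity, so each code bit is checked by one parity from each component code.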

Concatenated Codes 

Note from the block diagram of the product code that a block interleaver connects the two
component codes. More generally, as shown in Fig. 14.17, the two component codes need not
be limited to binary block codes, and a more general interleaver $\Pi$ can be used. The resulting
code is known as a concatenated code. Indeed, Forney 8 proposed concatenating one binary
and one nonbinary code to construct a much more powerful code. It is clear that product codes
are a special class of concatenated codes with binary component codes and a block interleaver.

Figure 14.17

In this serial concatenation, encoder 2 is known as the inner code whereas encoder 1 is 
known as the outer code. The most successful concatenation as proposed by Forney 8 uses a 
Reed-Solomon outer code and a binary convolutional inner code. The concatenated code can 
be decoded separately by first decoding the inner code before de-interleaving and decoding the 
outer code. More complex ways of iterative decoding are also possible to potentially achieve 
better performance. 


14.9 SOFT DECODING 

Thus far, we have focused on decoding methods that generate hard decisions based on either 
maximum likelihood or syndrome-based algebraic decoding. Hard-decision decoding refers to
the fact that the decoder generates only the most likely codeword without providing the relative 
confidence of this decoded codeword with respect to other possibilities. In other words, the 
hard-decision decoded codeword does not indicate how confident the decoder is about this 
decision. A stand-alone decoder can function as a hard-decision decoder because its goal is to 
provide the best candidate as the decoded codeword. It does not have to indicate how much 
confidence can be placed in this decision. 

In practice, however, a decoder is often operating in conjunction with other decoders and 
other receiver units. This means that the decoded codeword not only must meet the constraint 
of the current parity check matrix, its output must also satisfy other constraints such as those 
imposed by the parities of different component codes in a concatenated error correction code. 
By providing more than just one hard decision, a soft-decision decoder can output multiple
possible codewords, each with an associated reliability (likelihood) metric. This kind of soft 
decoding can allow other units in the receiver to jointly select the best codeword by utilizing 
the “soft” (reliability) information from the decoder along with other relevant constraints that 
the codeword must satisfy. 

It is more convenient to illustrate the soft decoding concept by means of a BPSK modulation
example. Let us revisit the optimum receiver of Sec. 10.6. We will focus on the special
case of binary modulation with the modulated data symbol represented by $b_i = \pm 1$ under an additive
white Gaussian noise channel. Let $c_{j,i}$ denote the ith code bit of the jth codeword $c_j$. Because the
modulation is BPSK, the relationship between the code bit $c_{j,i}$ and its corresponding modulated
symbol $b_{j,i}$ is simply

$$b_{j,i} = 2 c_{j,i} - 1$$

Assuming that the receiver filter output signal is ISI free, then the received signal samples
$r_i$ corresponding to the transmission of the n-bit (n, k) codeword $[c_{j,1}\ c_{j,2}\ \cdots\ c_{j,n}]$ can be
written as

$$r_i = \sqrt{E_b}\, b_{j,i} + w_i \qquad i = 1, 2, \ldots, n \tag{14.35}$$

Here w,- is an AW ON sample. We use C to denote the collection of all valid codewords. Based 
on the optimum receiver of See. 10.6 [Eq.( 10.91) and Fig. 10.18], the maximum likelihood 
decoder (MLD) of the received signal under coding corresponds to 

c — arg max r{bjj 

Cj 6 c 

= arg max ^ n(2cj t i - 1) 

= arg max 2 > r/e,— ) n 

CjGC ^ ^ 

= arg max ricn (14.36) 

Among all the $2^k$ codewords, the soft MLD not only can determine the most likely codeword 
as the output, it should also preserve the metric 

$$M_j = \sum_{i=1}^{n} r_i\, b_{j,i}$$

as the relative likelihood of the codeword $\mathbf{c}_j$ during the decoding process. Although equivalent 
to the distance measure, this (correlation) metric should be maximized for MLD. Unlike 
distance, the correlation metric can be both positive and negative. 
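
The correlation metric can be tried out on a small code. The sketch below is only an illustration: the (7, 4) Hamming generator matrix `G`, the helper names `all_codewords` and `soft_mld`, and the hand-picked received samples are assumptions, not part of the text. It enumerates all $2^k$ codewords, computes $M_j$ for each, and returns the maximizer.

```python
import itertools
import numpy as np

# Generator matrix of a (7, 4) Hamming code in systematic form (illustrative choice).
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

def all_codewords(G):
    """Enumerate all 2^k codewords of the binary linear code generated by G."""
    k = G.shape[0]
    msgs = np.array(list(itertools.product([0, 1], repeat=k)))
    return msgs.dot(G) % 2

def soft_mld(r, codewords):
    """Return the codeword maximizing M_j = sum_i r_i * b_{j,i}, plus all metrics."""
    b = 2 * codewords - 1              # BPSK mapping b = 2c - 1
    metrics = b.dot(r)                 # one correlation metric per codeword
    return codewords[np.argmax(metrics)], metrics

codewords = all_codewords(G)
tx = codewords[5]                      # transmitted codeword
r = (2 * tx - 1).astype(float)         # noiseless BPSK samples
r[2] *= -0.2                           # one weak sample with the wrong sign
r[6] *= 0.1                            # one weak sample with the right sign
c_hat, metrics = soft_mld(r, codewords)
print(c_hat, metrics.max())
```

Because every metric is retained, the same array could be handed to another receiver unit as the relative likelihoods of all candidates.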

Although the soft MLD appears to be a straightforward algorithm to implement, its computational 
complexity is affected by the size of the code. Indeed, when the code is long with 
a very large $k$, the computational complexity grows exponentially because $2^k$ metrics must 
be calculated. For many practical block codes, this requirement becomes unmanageable when 
the code length exceeds several hundred bits. 

To simplify this optimum decoder, Chase proposed several types of suboptimum soft 
decoding algorithms⁹ that are effective at significantly reduced computational cost. The first 
step of the Chase algorithms is to derive temporary hard bit decisions based on the received 
samples $r_i$. These temporary bits do not necessarily form a codeword. In other words, find 

$$\mathbf{y} = [y_1\ y_2\ \cdots\ y_n] \tag{14.37a}$$

where 

$$y_i = \operatorname{sign}(r_i), \qquad i = 1, 2, \ldots, n \tag{14.37b}$$

Each individual bit decision has reliability $|r_i|$. These temporary bits $\{y_i\}$ are sent to an algebraic 
decoder based on, for example, error syndromes. The result is an initial codeword $\mathbf{c}_0 = 
[c_{0,1}\ c_{0,2}\ \cdots\ c_{0,n}]$. This step is exactly the same as a conventional hard-decision decoder. 
However, Chase algorithms allow additional modifications to the hard decoder input $\mathbf{y}$ by 
flipping the least reliable bits. Flipping means changing a code bit from 1 to 0 and from 0 to 1. 

The idea of soft decoding is to provide multiple candidate codewords, each with an associated 
reliability measure. Chase algorithms generate the most likely flip patterns to be used to 
modify the hard decoder input $\mathbf{y}$. Each flip pattern $\mathbf{e}_i$ consists of 1s in bit positions to be flipped 
and 0s in the remaining bit positions. For each flip pattern $\mathbf{e}_i$, construct 

$$\mathbf{c}_i = \text{hard decision}(\mathbf{y}) \oplus \mathbf{e}_i \tag{14.38a}$$

and compute the corresponding reliability metric 

$$M_i = \sum_{j=1}^{n} r_j\,(2c_{i,j} - 1) \tag{14.38b}$$

The codeword with the maximum $M_i$ is the decoded output. 

There are three types of Chase algorithm. First, we sort the bit reliability from low to high: 

$$|r_{i_1}| \le |r_{i_2}| \le \cdots \le |r_{i_n}| \tag{14.39}$$

Type 1  Test all flipping patterns of weight less than or equal to $(d_{\min} - 1)$. 

Type 2  Identify the $\lfloor d_{\min}/2 \rfloor$ least reliable bit positions $\{i_1, i_2, \ldots, i_{\lfloor d_{\min}/2 \rfloor}\}$. Test all flipping 
        patterns of weight less than or equal to $\lfloor d_{\min}/2 - 1 \rfloor$.* 

Type 3  Test flipping patterns of weight $w = 1, 3, \ldots, d_{\min} - 1$ by placing 1s in the $w$ least 
        reliable bit positions. 


The block diagram of Chase algorithms is shown in Fig. 14.18. The three Chase algorithms 
differ only in how the flipping patterns are generated. In addition, we should note that Chase 
decoders can exchange reliability and likelihood information with other receiver units in a joint 
effort to improve the decoding performance. From the input end, the set of flipping patterns can 
take additional suggestions from other receiver units. From the output end, multiple codeword 
candidates, along with their reliability metrics, can be sent to additional decoding units for 
further processing and eventual elimination. 
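
A Chase-style decoder can be sketched in a few lines. The code below is an illustration under stated assumptions rather than any one of the three types exactly: it uses a (7, 4) Hamming parity check matrix `H` with a syndrome decoder as the algebraic decoder, tests every flip pattern on the `p` least reliable positions (`p` is a free parameter chosen here), and applies each flip to the hard decoder input before algebraic decoding. The names `hard_decode` and `chase_decode` are hypothetical.

```python
import itertools
import numpy as np

# Parity check matrix of a (7, 4) Hamming code (illustrative; all columns distinct).
H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def hard_decode(y_bits):
    """Syndrome decoder: flip the single bit whose column of H equals the syndrome."""
    c = y_bits.copy()
    s = H.dot(c) % 2
    if s.any():
        pos = np.where((H.T == s).all(axis=1))[0][0]
        c[pos] ^= 1
    return c

def chase_decode(r, p=2):
    """Try all flip patterns on the p least reliable bits; keep the best-metric codeword."""
    y = (r > 0).astype(int)                    # temporary hard decisions
    least = np.argsort(np.abs(r))[:p]          # positions with smallest |r_i|
    candidates = {}
    for flips in itertools.product([0, 1], repeat=p):
        e = np.zeros_like(y)
        e[least] = flips                       # flip pattern e
        c = hard_decode(y ^ e)                 # candidate codeword
        candidates[tuple(c.tolist())] = float(np.dot(r, 2 * c - 1))  # correlation metric
    best = max(candidates, key=candidates.get)
    return np.array(best), candidates

tx = np.array([0, 1, 0, 1, 1, 1, 0])           # a valid codeword of this code
r = 0.9 * (2 * tx - 1).astype(float)
r[0], r[3] = 0.1, -0.2                         # two unreliable samples with wrong signs
c_hat, candidates = chase_decode(r, p=2)
print(c_hat, candidates)
```

In this constructed case the plain hard-decision decoder sees two bit errors and miscorrects, while the candidate list contains the transmitted codeword with the larger metric.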


Figure 14.18 
* The operation $\lfloor \cdot \rfloor$ is often known as the “floor.” In particular, $\lfloor x \rfloor$ represents the largest integer less than or 
equal to $x$. 

14.10 SOFT-OUTPUT VITERBI ALGORITHM (SOVA) 

Chase algorithms can generate multiple candidate codewords and the associated reliability 
metrics. The metric information can be exploited by other receiver processing units to determine 
the final decoded codeword. If the decoder can produce soft reliability information on 
every decoded bit, then it can be much better utilized jointly with other soft-output decoders and 
processors. Unlike Chase algorithms, soft-output Viterbi algorithms (SOVA)¹⁰ and the maximum 
a posteriori (MAP) algorithms are the two most general soft decoding methods to produce 
bit reliability information. We first describe the principles of SOVA here. 

The most reliable and informative soft bit information is the log-likelihood ratio (LLR) 
of a particular code bit $c_i$ based on the received signal vector 

$$\mathbf{r} = (r_1, r_2, \ldots, r_n)$$

In other words, the LLR,¹¹ as defined by 

$$\Lambda(c_i) = \log \frac{P[c_i = 1 \mid \mathbf{r}]}{P[c_i = 0 \mid \mathbf{r}]} \tag{14.40}$$

indicates the degree of certainty by the decoder on the decision of $c_i = 1$. The degree of 
certainty varies from $-\infty$ when $P[c_i = 0 \mid \mathbf{r}] = 1$ to $+\infty$ when $P[c_i = 0 \mid \mathbf{r}] = 0$. 

Once again, we consider the BPSK case in which $(2c_i - 1) = \pm 1$ is the transmitted 
data and 

$$r_i = (2c_i - 1) + w_i, \qquad i = 1, 2, \ldots, n \tag{14.41}$$

where $w_i$ is the AWGN sample. Similar to the Chase algorithms, the path metric is computed 
by the correlation between $\{r_i\}$ and the BPSK signal $\{c_i\}$. In other words, based on the received 
data samples $\{r_i\}$, we can estimate 

$$\text{path metric between stages } n_1 \text{ and } n_2 = \sum_{i=n_1+1}^{n_2} r_i \cdot (2c_i - 1) \tag{14.42}$$
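
As a quick check on the LLR of Eq. (14.40) under the BPSK model of Eq. (14.41), consider a single uncoded bit with equally likely values and real Gaussian noise of variance $\sigma^2$. The symbol $\sigma^2$ and the uncoded, equal-prior setting are assumptions made here for illustration only. The LLR then reduces to a scaled copy of the received sample, which suggests why a correlation with the BPSK symbols serves as a path metric:

```latex
\begin{aligned}
\Lambda(c_i) &= \log\frac{p(r_i \mid c_i = 1)}{p(r_i \mid c_i = 0)}
  = \log\frac{\exp\!\left[-(r_i - 1)^2 / 2\sigma^2\right]}
             {\exp\!\left[-(r_i + 1)^2 / 2\sigma^2\right]} \\
 &= \frac{(r_i + 1)^2 - (r_i - 1)^2}{2\sigma^2} = \frac{2\, r_i}{\sigma^2}
\end{aligned}
```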


Like the traditional Viterbi algorithm, the SOVA decoder operates on the corresponding trellis 
of the (convolutional or block) code. SOVA consists of a forward step and a backward step. 
During the forward step, as in the conventional Viterbi algorithm, SOVA first finds the most 
likely sequence (survivor path). Unlike conventional VA, which stores only the surviving path 
metrics at the states in the current stage, SOVA stores the metric of every surviving path leading 
to a state for all stages. 

To formulate the idea formally, denote 

$$S_\ell(i) = \text{state } \ell \text{ at stage (time) } i$$

For each survivor at state $S_\ell$ in stage $i$, we will determine the forward path metric leading to this 
state. These forward metrics ending in state $\ell$ at time $i$ are denoted as $M_\ell^{f}(i)$. The maximum 
total path metric at the final state of the forward VA, denoted $M_{\max}$, corresponds to the optimum 
forward path. During the backward step, SOVA then applies VA backward from the terminal 
(final) state at stage $K$ and ends at the initial state at stage 0, also storing the backward metrics 
ending in state $\ell$ at stage $i$ as $M_\ell^{b}(i)$. 

Since the likely value of the information bit $d_i = 0, 1$ that leads to the transition between 
state $S_{\ell_a}(i - 1)$ and state $S_{\ell_b}(i)$ has been identified by VA during the forward step, the metric 
of information bit $M_i(d_i)$ can be fixed as the total path metric 

$$M_i(d_i) = M_{\max}$$

Our next task is to determine the best path and the corresponding maximum path metric 
$M_i(1 - d_i)$ if the opposite information bit value of $1 - d_i$ is chosen instead at stage $i$: 

$$M_i(1 - d_i) = \max_{\ell_a \to \ell_b} \left[ M_{\ell_a}^{f}(i - 1) + B_{\ell_a, \ell_b} + M_{\ell_b}^{b}(i) \right] \tag{14.43}$$

where $B_{\ell_a, \ell_b}$ is the path distance of the state transition from $\ell_a$ to $\ell_b$ with respect to the received 
sample $r_i$. The maximization is over all state transitions denoted by $(\ell_a \to \ell_b)$ that can be 
caused by the information bit value of $1 - d_i$ at stage $i$. 

This step allows us to find the best alternative path through the trellis if the alternative 
bit value $1 - d_i$ is selected. Now that we have both $M_i(d_i)$ and $M_i(1 - d_i)$ for every stage $i$, the 
likelihood of every information bit is proportional to the metric difference 

$$
\begin{aligned}
\Lambda_i = M_i(1) - M_i(0) &= (2d_i - 1)\,[M_i(d_i) - M_i(1 - d_i)] \\
&= (2d_i - 1)\,[M_{\max} - M_i(1 - d_i)]
\end{aligned}
\tag{14.44}
$$


Hence, the log-likelihood ratio $\Lambda_i$ can be generated by SOVA for every information bit $d_i$. We 
now can use the survivor path to determine the LLR [Eq. (14.40)] for every bit in this most 
likely sequence. The basic concept of finding the best alternative surviving path caused by an 
information bit value of $1 - d_i$ is illustrated in Fig. 14.19. 
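
The metric difference of Eq. (14.44) can be checked by brute force on a short frame. The sketch below does not run the forward and backward Viterbi passes; it enumerates every path instead, which is feasible only for tiny frames. The rate-1/2 encoder with generators $1 + D + D^2$ and $1 + D^2$, the two zero tail bits, and the function names are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def conv_encode(bits):
    """Rate-1/2 feedforward encoder (1+D+D^2, 1+D^2), terminated with two zero tail bits."""
    s1 = s2 = 0
    out = []
    for d in list(bits) + [0, 0]:
        out += [d ^ s1 ^ s2, d ^ s2]
        s1, s2 = d, s1
    return np.array(out)

def soft_outputs(r, K):
    """Per-bit metric differences M_i(1) - M_i(0), taken over all 2^K trellis paths."""
    paths = list(itertools.product([0, 1], repeat=K))
    metrics = np.array([np.dot(r, 2 * conv_encode(d) - 1) for d in paths])
    d_hat = np.array(paths[int(np.argmax(metrics))])   # most likely sequence
    llr = np.empty(K)
    for i in range(K):
        bit_i = np.array([d[i] for d in paths])
        # best path with d_i = 1 versus best path with d_i = 0
        llr[i] = metrics[bit_i == 1].max() - metrics[bit_i == 0].max()
    return d_hat, metrics.max(), llr

d = [1, 0, 1, 1]
r = (2 * conv_encode(d) - 1).astype(float)
r[0] *= 0.1                                            # weaken one sample
d_hat, m_max, llr = soft_outputs(r, K=4)
print(d_hat, m_max, llr)
```

One of the two maxima for each bit is always the overall maximum, so `llr[i]` equals $(2d_i - 1)[M_{\max} - M_i(1 - d_i)]$ for the decoded bit $d_i$.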

14.11 TURBO CODES 


As we briefly mentioned in Sec. 14.1, turbo codes represent one of the major breakthroughs¹² 
in coding theory over the past several decades. The mechanism that made turbo codes possible 
is their simplified decoder. Turbo codes would not have been possible without a soft decoding 
algorithm. In fact, a short paper published more than 30 years earlier by Bahl, Cocke, Jelinek, 
and Raviv¹³ played a major role. Their maximum a posteriori (MAP) soft decoding algorithm 
is known as the BCJR algorithm. Before describing the essence of turbo codes, we introduce 
the fundamentals of the BCJR algorithm. 

BCJR Algorithm for MAP Detection 

Our description of the BCJR MAP algorithm is based on the presentation by Bahl, Cocke, 
Jelinek, and Raviv.¹³ We first assume a sequence of information data bits denoted by 

$$d_1\ d_2\ \cdots\ d_N \tag{14.45}$$

The information bits $\{d_i\}$ are encoded into codeword bits $\{v_i\}$, which are further modulated into 
(complex) modulated data symbols $\{b_i\}$. In the general case, we simply note that there is a 
mapping of 

$$\{d_i\} \longrightarrow \{b_i\} \tag{14.46}$$

In the special case of BPSK, $b_i = \pm 1$. 

The modulated data symbols are transmitted in an i.i.d. noise channel, and the received 
signal samples are 

$$r_i = b_i + w_i \tag{14.47}$$

in which $w_i$ are i.i.d. noise samples. Following MATLAB notation, we denote the received 
data 

$$\mathbf{r}_{k:\ell} = (r_k, r_{k+1}, \ldots, r_\ell)$$

$$\mathbf{r} = (r_1, r_2, \ldots, r_N)$$

Because the data symbols and the channel noise are i.i.d., we conclude that the conditional 
probability depends only on the current modulated symbol 

$$p(r_i \mid b_i, \mathbf{r}_{1:i-1}) = p(r_i \mid b_i) \tag{14.48}$$

The (convolutional or block) code is represented by a trellis diagram in which $S_i = m$ 
denotes the event that the trellis state is $m$ at time $i$. The transition probability between state 
$m'$ and $m$ from stage $i - 1$ to stage $i$ is represented by 

$$P[S_i = m \mid S_{i-1} = m']$$

The definition of the state trellis means that $S_i$ is a Markov process.* Based on the properties 
of Markov processes, and the knowledge that $\mathbf{r}_{i+1:N}$ and $\mathbf{r}_{1:i}$ are independent, we have the 
following simplifications of the conditional probabilities: 

$$p(\mathbf{r}_{i+1:N} \mid S_i = m, S_{i-1} = m', \mathbf{r}_{1:i}) = p(\mathbf{r}_{i+1:N} \mid S_i = m) \tag{14.49a}$$

$$p(r_i, S_i = m \mid S_{i-1} = m', \mathbf{r}_{1:i-1}) = p(r_i, S_i = m \mid S_{i-1} = m') \tag{14.49b}$$

* A random process $x_k$ is a Markov process if its conditional probability satisfies 

$$p(x_k \mid x_{k-1}, x_{k-2}, \ldots) = p(x_k \mid x_{k-1})$$

In other words, a Markov process has a very short memory. All the information relevant to $x_k$ from its entire history 
is available in its immediate past $x_{k-1}$. 

The MAP detector needs to determine the log-likelihood ratio 

$$\Lambda(d_i) = \log \frac{P[d_i = 1 \mid \mathbf{r}]}{P[d_i = 0 \mid \mathbf{r}]} = \log \frac{p(d_i = 1, \mathbf{r})}{p(d_i = 0, \mathbf{r})} \tag{14.50}$$


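We are now ready to explain the operations of the BCJR algorithm. First, let $\Omega_i(u)$ denote 
the set of all possible state transitions from $S_{i-1} = m'$ to $S_i = m$ when $d_i = u$ ($u = 0, 1$). 
There are only two such sets, for $d_i = 1$ and $d_i = 0$. We can see that 

$$
\begin{aligned}
p(d_i = 1, \mathbf{r}) &= \sum_{(m', m) \in \Omega_i(1)} p(S_{i-1} = m', S_i = m, \mathbf{r}) \\
&= \sum_{(m', m) \in \Omega_i(1)} p(S_{i-1} = m', \mathbf{r}_{1:i}, S_i = m, \mathbf{r}_{i+1:N}) \\
&= \sum_{(m', m) \in \Omega_i(1)} p(S_{i-1} = m', \mathbf{r}_{1:i-1}, S_i = m, r_i) \cdot p(\mathbf{r}_{i+1:N} \mid S_{i-1} = m', \mathbf{r}_{1:i}, S_i = m)
\end{aligned}
\tag{14.51}
$$

Applying Eqs. (14.49a) and (14.49b) to the last equality, we have 

$$
\begin{aligned}
p(d_i = 1, \mathbf{r}) &= \sum_{(m', m) \in \Omega_i(1)} p(S_{i-1} = m', \mathbf{r}_{1:i-1}, S_i = m, r_i) \cdot p(\mathbf{r}_{i+1:N} \mid S_i = m) \\
&= \sum_{(m', m) \in \Omega_i(1)} p(S_{i-1} = m', \mathbf{r}_{1:i-1}) \cdot p(S_i = m, r_i \mid S_{i-1} = m') \cdot p(\mathbf{r}_{i+1:N} \mid S_i = m)
\end{aligned}
\tag{14.52}
$$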

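Applying the notations used by Bahl et al.,¹³ we define 

$$\alpha_{i-1}(m') = p(S_{i-1} = m', \mathbf{r}_{1:i-1}) \tag{14.53a}$$

$$\beta_i(m) = p(\mathbf{r}_{i+1:N} \mid S_i = m) \tag{14.53b}$$

$$\gamma_i(m', m) = p(S_i = m, r_i \mid S_{i-1} = m') \tag{14.53c}$$

Given the notations in Eq. (14.53), we can use Eqs. (14.50) to (14.52) to write the LLR of each 
information bit $d_i$ as 

$$\Lambda(d_i) = \log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m')\, \gamma_i(m', m)\, \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m')\, \gamma_i(m', m)\, \beta_i(m)} \tag{14.54}$$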
This provides the soft decoding information for the $i$th information bit $d_i$. The MAP decoding 
can generate a hard decision simply by taking the sign of the LLR 

$$u = \operatorname{sign}[\Lambda(d_i)]$$

To implement the BCJR algorithm, we apply a forward recursion to obtain $\alpha_i(m)$; 
that is, 

$$
\begin{aligned}
\alpha_i(m) &= p(S_i = m, \mathbf{r}_{1:i}) \\
&= \sum_{m'} p(S_i = m, S_{i-1} = m', \mathbf{r}_{1:i-1}, r_i) \\
&= \sum_{m'} p(S_i = m, r_i \mid S_{i-1} = m', \mathbf{r}_{1:i-1}) \cdot p(S_{i-1} = m', \mathbf{r}_{1:i-1}) \\
&= \sum_{m'} \gamma_i(m', m) \cdot \alpha_{i-1}(m')
\end{aligned}
\tag{14.55}
$$


The last equality comes from Eq. (14.49b). The initial state of the encoder should be $S_0 = 0$. 
In other words, 

$$\alpha_0(m) = P[S_0 = m] = \delta[m] = \begin{cases} 1, & m = 0 \\ 0, & m \ne 0 \end{cases}$$

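from which the forward recursion can proceed. The backward recursion is for computing 
$\beta_{i-1}(m')$ from 

$$
\begin{aligned}
\beta_{i-1}(m') &= p(\mathbf{r}_{i:N} \mid S_{i-1} = m') \\
&= \sum_{m} p(S_i = m, r_i, \mathbf{r}_{i+1:N} \mid S_{i-1} = m') \\
&= \sum_{m} p(\mathbf{r}_{i+1:N} \mid S_{i-1} = m', S_i = m, r_i) \cdot p(S_i = m, r_i \mid S_{i-1} = m') \\
&= \sum_{m} p(\mathbf{r}_{i+1:N} \mid S_i = m) \cdot \gamma_i(m', m) \\
&= \sum_{m} \beta_i(m) \cdot \gamma_i(m', m)
\end{aligned}
\tag{14.56}
$$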


For an encoder with a known terminal state of $S_N = 0$, we can initialize the backward recursion 
with 

$$\beta_N(m) = \delta[m]$$

Notice that both the forward and backward recursions depend on the function $\gamma_i(m', m)$. 
In fact, $\gamma_i(m', m)$ is already in a simple matrix form. The entry $\gamma_i(m', m)$ can be simplified and 
derived from the basic modulation and channel information: 

$$
\begin{aligned}
\gamma_i(m', m) &= p(S_i = m, r_i \mid S_{i-1} = m') \\
&= p(r_i \mid S_{i-1} = m', S_i = m) \cdot P[S_i = m \mid S_{i-1} = m'] \\
&= p(r_i \mid c_i[m', m]) \cdot P[d_i = u]
\end{aligned}
\tag{14.57}
$$

where $c_i[m', m]$ is the codeword from the encoder output corresponding to the state transition 
from $m'$ to $m$, whereas $d_i = u$ is the corresponding input bit. To determine $\gamma_i(m', m)$ for $d_i = u$ 
according to Eq. (14.57), $p(r_i \mid c_i[m', m])$ is determined by the mapping from encoder output 
$c_i[m', m]$ to the modulated symbol $b_i$ and the distribution of the channel noise $w_i$. 
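
The forward and backward recursions and the LLR of Eq. (14.54) can be sketched for a small trellis. Everything specific below is an assumption made for illustration: a 4-state rate-1/2 encoder with generators $1 + D + D^2$ and $1 + D^2$, real BPSK samples with noise variance `sigma2`, equal priors on the information bits, two known zero tail bits so that $S_N = 0$, and per-stage normalization of `alpha` and `beta` (which cancels in the LLR ratio).

```python
import numpy as np

N_STATES = 4

def step(state, d):
    """One trellis transition: returns (next_state, (coded bit 1, coded bit 2))."""
    s1, s2 = state >> 1, state & 1
    return (d << 1) | s1, (d ^ s1 ^ s2, d ^ s2)

def encode(bits):
    """Encode K information bits followed by two zero tail bits."""
    state, out = 0, []
    for d in list(bits) + [0, 0]:
        state, c = step(state, d)
        out.extend(c)
    return np.array(out)

def bcjr(r, K, sigma2):
    """LLRs of the K information bits from 2*(K+2) received BPSK samples."""
    N = K + 2
    r = np.asarray(r, dtype=float).reshape(N, 2)
    p1 = np.array([0.5] * K + [0.0, 0.0])          # P[d_i = 1]; tail bits are zeros
    # gamma[i, m', m, u] = p(r_i | c_i[m', m]) * P[d_i = u], as in Eq. (14.57)
    gamma = np.zeros((N, N_STATES, N_STATES, 2))
    for i in range(N):
        for m_prev in range(N_STATES):
            for u in (0, 1):
                m, c = step(m_prev, u)
                b = 2 * np.array(c) - 1
                lik = np.exp(-np.sum((r[i] - b) ** 2) / (2 * sigma2))
                gamma[i, m_prev, m, u] = lik * (p1[i] if u else 1 - p1[i])
    g = gamma.sum(axis=3)
    alpha = np.zeros((N + 1, N_STATES)); alpha[0, 0] = 1.0    # S_0 = 0
    beta = np.zeros((N + 1, N_STATES)); beta[N, 0] = 1.0      # S_N = 0
    for i in range(N):                                        # forward recursion
        alpha[i + 1] = alpha[i] @ g[i]
        alpha[i + 1] /= alpha[i + 1].sum()
    for i in range(N - 1, -1, -1):                            # backward recursion
        beta[i] = g[i] @ beta[i + 1]
        beta[i] /= beta[i].sum()
    llr = np.empty(K)
    for i in range(K):                                        # ratio over the two transition sets
        num = alpha[i] @ gamma[i, :, :, 1] @ beta[i + 1]
        den = alpha[i] @ gamma[i, :, :, 0] @ beta[i + 1]
        llr[i] = np.log(num / den)
    return llr

d = [1, 0, 1, 1]
r = 2 * encode(d) - 1 + np.random.default_rng(0).normal(0, 0.5, 12)
print(bcjr(r, K=4, sigma2=0.25))
```

Note the indexing shift: `alpha[i]` in the code plays the role of $\alpha_{i-1}$ for the stage that consumes the $i$th pair of samples (counting from zero).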

In the special case of the convolutional code in Fig. 14.5, for every data symbol $d_i$, the 
convolutional encoder generates two coded bits $\{v_{i,1}, v_{i,2}\}$. The mapping from the coded bits 
$\{v_{i,1}, v_{i,2}\}$ to modulated symbol(s) $b_i$ depends on the modulation. In BPSK, each coded 
bit is mapped to $\pm 1$ and $\mathbf{b}_i$ has two entries 

$$\mathbf{b}_i = \begin{bmatrix} 2v_{i,1} - 1 \\ 2v_{i,2} - 1 \end{bmatrix}$$

If QPSK modulation is applied, then we can use a Gray mapping 

$$b_i = e^{j\phi_i}$$

where 

$$
\phi_i = \begin{cases}
0, & \{v_{i,1}, v_{i,2}\} = \{0, 0\} \\
\pi/2, & \{v_{i,1}, v_{i,2}\} = \{0, 1\} \\
\pi, & \{v_{i,1}, v_{i,2}\} = \{1, 1\} \\
-\pi/2, & \{v_{i,1}, v_{i,2}\} = \{1, 0\}
\end{cases}
$$

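Hence, in a baseband AWGN channel, the received signal sample under QPSK is 

$$r_i = \sqrt{E_s}\, e^{j\phi_i} + w_i \tag{14.58}$$

in which $w_i$ is the complex, i.i.d. channel noise with probability density function 

$$p_w(x) = \frac{1}{\pi\mathcal{N}} \exp\left(-\frac{|x|^2}{\mathcal{N}}\right)$$

As a result, in this case 

$$
\begin{aligned}
p(r_i \mid c_i[m', m]) &= p(r_i \mid d_i = u) \\
&= p(r_i \mid b_i = e^{j\phi_i}) \\
&= p_w\!\left(r_i - \sqrt{E_s}\, e^{j\phi_i}\right) \\
&= \frac{1}{\pi\mathcal{N}} \exp\left(-\frac{\left|r_i - \sqrt{E_s}\, e^{j\phi_i}\right|^2}{\mathcal{N}}\right)
\end{aligned}
\tag{14.59}
$$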
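
The QPSK branch likelihood can be evaluated directly. In the sketch below the dictionary `GRAY_PHASE` and the function `qpsk_likelihood` are hypothetical names, `N0` stands in for the noise parameter $\mathcal{N}$, and the numeric values are made up for illustration.

```python
import numpy as np

# Gray mapping from the coded bit pair (v_{i,1}, v_{i,2}) to the QPSK phase.
GRAY_PHASE = {(0, 0): 0.0, (0, 1): np.pi / 2, (1, 1): np.pi, (1, 0): -np.pi / 2}

def qpsk_likelihood(r, bits, Es, N0):
    """p(r | coded bit pair) for one complex sample in complex Gaussian noise."""
    s = np.sqrt(Es) * np.exp(1j * GRAY_PHASE[tuple(bits)])   # noiseless symbol
    return np.exp(-abs(r - s) ** 2 / N0) / (np.pi * N0)

Es, N0 = 1.0, 0.5
r = 0.1 + 0.9j                                               # sample near phase pi/2
best = max(GRAY_PHASE, key=lambda bits: qpsk_likelihood(r, bits, Es, N0))
print(best)
```

In a BCJR implementation this value, multiplied by the prior of the input bit that causes the transition, would give the branch term $\gamma_i(m', m)$.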
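The BCJR MAP algorithm can compute the LLR of each information bit according to 

$$
\begin{aligned}
\Lambda(d_i) &= \log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m')\, p(r_i \mid c_i[m', m])\, P[d_i = 1]\, \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m')\, p(r_i \mid c_i[m', m])\, P[d_i = 0]\, \beta_i(m)} \\
&= \underbrace{\log \frac{P[d_i = 1]}{P[d_i = 0]}}_{\Lambda^{(a)}(d_i)} + \underbrace{\log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m')\, p(r_i \mid c_i[m', m])\, \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m')\, p(r_i \mid c_i[m', m])\, \beta_i(m)}}_{\Lambda^{(\ell)}(d_i)}
\end{aligned}
\tag{14.60}
$$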


Equation (14.60) shows that the LLR of a given information symbol $d_i$ consists of two parts: 

• The a priori information $\Lambda^{(a)}(d_i)$ from the prior probability of the data symbol $d_i$, which may 
be provided a priori or externally by another decoder. 

• The local information $\Lambda^{(\ell)}(d_i)$ that is specified by the received signals and the code trellis (or 
state transition) constraints. 


With this decomposition view of the LLR, we are now ready to explain the concept of turbo 
codes, or more appropriately, turbo decoding. 


Turbo Codes 

The concept of turbo codes was first proposed by Berrou, Glavieux, and Thitimajshima¹² in 
1993 at the annual IEEE International Conference on Communications. The authors' claim of 
near-Shannon-limit error correcting performance was initially met with great skepticism. This 
reaction was natural because the proposed turbo code exhibited BER performance within 1 dB 
of the Shannon limit, which had been considered extremely challenging, if not impossible, 
to achieve under reasonable computational complexity. Moreover, the construction of the 
so-called turbo codes does not take a particularly structured form. It took nearly two years for 
the coding community to become convinced of the extraordinary BER performance of turbo 
codes and to begin to understand their principles. Today, turbo codes have permeated many 
aspects of digital communications, often taking specially evolved forms. In this part of the 
section, we provide a brief introduction to the basic principles of turbo codes. 

A block diagram of the first turbo encoder is shown in Fig. 14.20a. This turbo encoder consists of 
two recursive systematic convolutional (RSC) codes. Representing a unit delay as $D$, the $1 \times 2$ 
generator matrix of the rate 1/2 RSC code is of the form 

$$G(D) = \begin{bmatrix} 1 & \dfrac{g_2(D)}{g_1(D)} \end{bmatrix}$$

In particular, the example turbo code of Berrou et al.¹² was specified by 

$$G(D) = \begin{bmatrix} 1 & \dfrac{1 + D^2 + D^3 + D^4}{1 + D + D^4} \end{bmatrix}$$

The simple implementation of the encoder is shown in Fig. 14.20b. 
Figure 14.20 Parallel concatenated turbo code: (a) rate 1/3 turbo encoder built from two rate 1/2 RSC codes; (b) implementation of the recursive systematic convolutional (RSC) encoder with $g_1(D) = 1 + D + D^4$ and $g_2(D) = 1 + D^2 + D^3 + D^4$. 


In this example, a frame of information bits $\{d_i\}$ is sent through two RSC encoders. Both 
convolutional codes have rate 1/2 and are systematic. Thus, the first RSC encoder generates a 
frame of coded bits $\{p_i^{(1)}\}$ of length equal to the information frame. Before entering the second 
RSC encoder, the information bits are interleaved by a random block interleaver $\Pi$. As a result, 
even with the same encoder structure as the first encoder, the second encoder will generate 
a different coded bit frame $\{p_i^{(2)}\}$. The overall turbo code consists of the information bits and 
the two coded (parity) bit streams. The code rate is 1/3, as the turbo code has two coded 
frames for the same information frame. Then $\{d_i, p_i^{(1)}, p_i^{(2)}\}$ are modulated and transmitted 
over communication channels. Additional interleavers and RSC encoders can be added to 
obtain codes that have lower rates and are more powerful. 
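
The rate 1/3 encoder can be sketched as follows, assuming the polynomials $g_1(D) = 1 + D + D^4$ (feedback) and $g_2(D) = 1 + D^2 + D^3 + D^4$ (feedforward) given with Fig. 14.20. The function names, the absence of trellis termination, the seeded pseudorandom permutation used as the interleaver, and the bit ordering of the output are all choices made here for illustration.

```python
import numpy as np

def rsc_parity(d):
    """Parity stream of an RSC encoder with feedback g1 and feedforward g2."""
    reg = [0, 0, 0, 0]                       # reg[k] holds the feedback bit delayed by k+1
    parity = []
    for bit in d:
        a = bit ^ reg[0] ^ reg[3]            # feedback taps D and D^4 of g1
        parity.append(a ^ reg[1] ^ reg[2] ^ reg[3])   # taps 1, D^2, D^3, D^4 of g2
        reg = [a] + reg[:3]                  # shift the register
    return np.array(parity)

def turbo_encode(d, perm):
    """Return (systematic bits, parity of encoder 1, parity of encoder 2)."""
    d = np.asarray(d)
    p1 = rsc_parity(d)                       # first encoder sees the original order
    p2 = rsc_parity(d[perm])                 # second encoder sees the interleaved order
    return d, p1, p2

rng = np.random.default_rng(1)
d = rng.integers(0, 2, size=8)
perm = rng.permutation(8)                    # stand-in for the random block interleaver
d, p1, p2 = turbo_encode(d, perm)
codeword = np.column_stack([d, p1, p2]).ravel()   # three coded bits per information bit
print(codeword)
```

A single 1 at the input produces a parity response that starts 1, 1, 0, 1, 1 and does not die out, which is the recursive behavior that distinguishes an RSC encoder from a feedforward one.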

To construct turbo codes that have higher rates, the two convolutional encoder outputs 
$p_i^{(1)}$ and $p_i^{(2)}$ can be selectively but systematically discarded (e.g., by keeping only half the bits 
in $p_i^{(1)}$ and $p_i^{(2)}$). This process, commonly referred to as puncturing, creates two RSC codes 
that are more efficient, each of rate 2/3. The total turbo code rate is therefore 1/2, since for 
every information bit, there are two coded bits (one information bit and one parity bit). 
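
One way to realize such puncturing is to alternate between the two parity streams. The pattern below (first parity stream at even positions, second at odd positions) and the function name `puncture` are assumptions for illustration; other patterns that keep half of each stream would serve equally well.

```python
import numpy as np

def puncture(d, p1, p2):
    """Keep one parity bit per information bit, alternating between p1 and p2."""
    idx = np.arange(len(d))
    parity = np.where(idx % 2 == 0, p1, p2)      # p1 at even i, p2 at odd i
    return np.column_stack([d, parity]).ravel()  # interleave as d_0, parity_0, d_1, ...

out = puncture([1, 0, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0])
print(out)   # 8 coded bits for 4 information bits, i.e., rate 1/2
```

The decoder must know the same pattern so that it can treat the discarded parity positions as erasures.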

Thus, the essence of turbo code is simply a combination of two component RSC codes. 
Although each component code has very few states and can be routinely decoded via decoding 
algorithms such as VA, SOVA, and BCJR, the random interleaver makes the overall code 
much more challenging to decode exactly because it has too many states to be decoded 
by means of traditional MAP or VA decoders. Since each component code can be decoded 
by using simple decoders, the true merit of turbo codes in fact lies in iterative decoding, the 
concept of allowing the two component decoders to exchange information iteratively. 


Iterative Decoding for Turbo Codes 

It is important to note that naive iteration between two (hard) decoders cannot guarantee 
convergence to the result of the highly complex but exact turbo decoder. Turbo decoding is 
made possible and powerful by utilizing the previously discussed BCJR decoding algorithm 
(or its variations). Each component code can be decoded by using a BCJR soft decoding 
algorithm. BCJR soft decoding makes it possible for iterative turbo decoding to exchange soft 
information between the two soft decoders. 

The idea of iterative decoding can be simply described as follows. Given the channel 
output, both decoders can generate the soft information $\Lambda(d_i)$ according to Eq. (14.60): 

$$\Lambda_1(d_i) = \Lambda_1^{(a)}(d_i) + \Lambda_1^{(\ell)}(d_i) \tag{14.61a}$$

$$\Lambda_2(d_i) = \Lambda_2^{(a)}(d_i) + \Lambda_2^{(\ell)}(d_i) \tag{14.61b}$$

Figure 14.21 Exchange of extrinsic information between two component BCJR decoders for iterative turbo decoding. 


Note that $\Lambda_1^{(a)}(d_i)$ and $\Lambda_2^{(a)}(d_i)$ are the a priori information on the information bit $d_i$ at decoder 
1 and decoder 2, respectively. Without any prior knowledge, the decoders should just treat them 
as 0 because the two values of $d_i$ are equally likely. 

Iterative decoding must allow the two low complexity decoders to exchange information. 
To accomplish this, decoder 1 can apply the BCJR algorithm to find the LLR information about $d_i$. 
It can then pass this learned information to decoder 2 as the a priori LLR. Note that this learned 
information must be unavailable to decoder 2 from its own decoder and other input signals. 
To provide innovative information, decoder 1 should remove any redundant information to 
generate its extrinsic information $\Lambda_{1 \to 2}^{(e)}(d_i)$ to pass to decoder 2. Similarly, decoder 2 will 
find out its extrinsic information $\Lambda_{2 \to 1}^{(e)}(d_i)$ (previously unavailable to decoder 1) and pass 
it back to decoder 1 as a priori information for decoder 1 to refresh/update its LLR on $d_i$. 
This closed-loop iteration repeats until satisfactory convergence. The 
conceptual block diagram of this iterative turbo decoder appears in Fig. 14.21. 

We now use the example given by Bahl et al.¹³ to explain how to update the extrinsic 
information for exchange between two soft decoders. Figure 14.21 illustrates the basic signal 
flow of the iterative turbo decoder. There are two interconnected BCJR MAP decoders. Let us 
now focus on one decoder (decoder 1) and its BCJR implementation. For the first systematic 
RSC code, the output code bits corresponding to the information bit $d_i$ are 

$$c_i[m', m] = \left(d_i,\ p_i^{(1)}\right)$$

To determine $p(r_i \mid c_i[m', m])$, it is necessary to specify the modulation format and the channel 
model. 

We consider the special and simple case of BPSK modulation under channel noise that is 
additive, white, and Gaussian. In this case, there are two received signal samples as a result of 
the coded bits $c_i[m', m] = (d_i, p_i^{(1)})$. More specifically, from encoder 1, the channel output 
consists of two signal sequences 

$$r_{i,s} = \sqrt{E_b}\,(2d_i - 1) + w_i \tag{14.62a}$$

$$r_{i,p}^{(1)} = \sqrt{E_b}\,(2p_i^{(1)} - 1) + w_{i,1} \tag{14.62b}$$

whereas from encoder 2, the channel outputs are 

$$r_{i,s} = \sqrt{E_b}\,(2d_i - 1) + w_i \tag{14.63a}$$

$$r_{i,p}^{(2)} = \sqrt{E_b}\,(2p_i^{(2)} - 1) + w_{i,2} \tag{14.63b}$$

Note that the Gaussian noises $w_i$, $w_{i,1}$, and $w_{i,2}$ are all independent with identical Gaussian 
distribution of zero mean and variance $\mathcal{N}/2$. The first BCJR decoder is given signals $r_{i,s}$ and 
$r_{i,p}^{(1)}$ to decode, whereas the second BCJR decoder is given signals $r_{i,s}$ and $r_{i,p}^{(2)}$ to decode. 
Let us first denote $p_i[m', m]$ as the $i$th parity bit at a decoder corresponding to message bit $d_i$. 
It naturally corresponds to the transition from state $m'$ to state $m$. For each decoder, the 
received channel output signals $\mathbf{r}_i = [r_{i,s},\ r_{i,p}]$ specify $\gamma_i(m', m)$ via 


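$$
\begin{aligned}
\gamma_i(m', m) &= p(\mathbf{r}_i \mid c_i[m', m])\, P(d_i) \\
&= p(r_{i,s}, r_{i,p} \mid d_i, p_i[m', m])\, P(d_i) \\
&= \frac{1}{\pi\mathcal{N}} \exp\left\{ -\frac{\left|r_{i,s} - \sqrt{E_b}\,(2d_i - 1)\right|^2 + \left|r_{i,p} - \sqrt{E_b}\,(2p_i[m', m] - 1)\right|^2}{\mathcal{N}} \right\} P(d_i) \\
&= \frac{1}{\pi\mathcal{N}} \exp\left( -\frac{r_{i,s}^2 + r_{i,p}^2 + 2E_b}{\mathcal{N}} \right) \\
&\quad \times \exp\left\{ \frac{2\sqrt{E_b}}{\mathcal{N}} \left[ r_{i,s}(2d_i - 1) + r_{i,p}(2p_i[m', m] - 1) \right] \right\} P(d_i)
\end{aligned}
\tag{14.64}
$$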


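Notice that the first term in Eq. (14.64) is independent of the codeword or the transition from 
$m'$ to $m$. Thus, the LLR at this decoder becomes 

$$
\begin{aligned}
\Lambda(d_i) &= \log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m')\, p(\mathbf{r}_i \mid c_i[m', m])\, P[d_i = 1]\, \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m')\, p(\mathbf{r}_i \mid c_i[m', m])\, P[d_i = 0]\, \beta_i(m)} \\
&= \log \frac{P[d_i = 1]}{P[d_i = 0]} + \log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m') \exp\left\{ \frac{2\sqrt{E_b}}{\mathcal{N}} \left[ r_{i,s} + 2 r_{i,p}\, p_i[m', m] \right] \right\} \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m') \exp\left\{ \frac{2\sqrt{E_b}}{\mathcal{N}} \left[ -r_{i,s} + 2 r_{i,p}\, p_i[m', m] \right] \right\} \beta_i(m)}
\end{aligned}
\tag{14.65}
$$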


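By defining the gain parameter $\zeta = 4\sqrt{E_b}/\mathcal{N}$, we can simplify the LLR into 

$$
\Lambda(d_i) = \underbrace{\log \frac{P[d_i = 1]}{P[d_i = 0]}}_{\Lambda^{(a)}} + \underbrace{\zeta \cdot r_{i,s}}_{\Lambda^{(c)}} + \underbrace{\log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m') \exp\left(\zeta \cdot r_{i,p}\, p_i[m', m]\right) \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m') \exp\left(\zeta \cdot r_{i,p}\, p_i[m', m]\right) \beta_i(m)}}_{\Lambda^{(e)}}
\tag{14.66}
$$

In other words, for every information bit $d_i$, the LLR of both decoders can be decomposed 
into three parts as in 

$$\Lambda_j(d_i) = \Lambda_j^{(a)}(d_i) + \Lambda_j^{(c)}(d_i) + \Lambda_j^{(e)}(d_i), \qquad j = 1, 2$$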
where $\Lambda_j^{(a)}(d_i)$ is the prior information provided by the other decoder, $\Lambda_j^{(c)}(d_i)$ is the channel 
output information shared by both decoders, and $\Lambda_j^{(e)}(d_i)$ is the extrinsic information uniquely 
obtained by the $j$th decoder that is used by the other decoder as prior information. This means 
that at any given iteration, decoder 1 needs to compute 

$$\Lambda_1(d_i) = \Lambda_{2 \to 1}^{(e)}(d_i) + \zeta \cdot r_{i,s} + \Lambda_{1 \to 2}^{(e)}(d_i) \tag{14.67a}$$

in which $\Lambda_{2 \to 1}^{(e)}(d_i)$ is the extrinsic information passed from decoder 2, whereas $\Lambda_{1 \to 2}^{(e)}(d_i)$ is 
the new extrinsic information to be sent to decoder 2 to refresh or update its LLR via 

$$\Lambda_2(d_i) = \Lambda_{1 \to 2}^{(e)}(d_i) + \zeta \cdot r_{i,s} + \Lambda_{2 \to 1}^{(e)}(d_i) \tag{14.67b}$$

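At both decoders, the updating of the extrinsic information requires the updating of $\alpha_i(m)$ 
and $\beta_i(m)$ before the computation of extrinsic information 

$$\Lambda^{(e)} = \log \frac{\sum_{(m', m) \in \Omega_i(1)} \alpha_{i-1}(m') \exp\left(\zeta \cdot r_{i,p}\, p_i[m', m]\right) \beta_i(m)}{\sum_{(m', m) \in \Omega_i(0)} \alpha_{i-1}(m') \exp\left(\zeta \cdot r_{i,p}\, p_i[m', m]\right) \beta_i(m)} \tag{14.68}$$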


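To refresh $\alpha_i(m)$ and $\beta_i(m)$ based on the extrinsic information $\Lambda^{(e)}$, we need to recompute at 
each decoder 

$$\gamma_i(m', m) = p(\mathbf{r}_i \mid d_i)\, P(d_i) \tag{14.69}$$

$$\sim \left\{ (1 - d_i) + d_i \exp[\Lambda^{(e)}] \right\} \exp\left(0.5\zeta \cdot r_{i,s}(2d_i - 1)\right) \exp\left(\zeta \cdot r_{i,p}\, p_i[m', m]\right) \tag{14.70}$$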

Once decoder 1 has finished its BCJR decoding, it can provide its soft output as the 
prior information about $d_i$ to decoder 2. When decoder 2 finishes its BCJR decoding, utilizing 
the prior information from decoder 1, it should provide its new soft information about $d_i$ 
back to decoder 1. To ascertain that decoder 2 does not feed back the “stale” information 
that originally came from decoder 1, we must subtract the stale information before feedback, 
thereby providing only the extrinsic information $\Lambda_{2 \to 1}^{(e)}(d_i)$ back to decoder 1 as “priors” 
for decoder 1 in the next iteration. Similarly, in the next iteration, decoder 1 will update its 
soft output and subtract the stale information that originally came from decoder 2 to provide 
refreshed extrinsic information $\Lambda_{1 \to 2}^{(e)}(d_i)$ as priors for decoder 2. This exchange of extrinsic 
information is illustrated in Fig. 14.21. 
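
The subtraction of stale information amounts to rearranging Eq. (14.67): the extrinsic part is the total LLR minus the prior received from the other decoder and minus the shared channel term. The sketch below uses made-up numbers and a hypothetical helper name `extrinsic`; it is not a full turbo decoder.

```python
import numpy as np

def extrinsic(llr_total, llr_prior, r_s, zeta):
    """Extrinsic LLR: total LLR minus the prior term and the channel term zeta * r_s."""
    return llr_total - llr_prior - zeta * r_s

Eb, noise = 1.0, 1.0                       # illustrative values only
zeta = 4 * np.sqrt(Eb) / noise             # gain parameter from the text
llr_1 = np.array([3.0, -1.0])              # total LLRs from decoder 1 (made up)
prior_from_2 = np.array([0.5, 0.2])        # extrinsic information received from decoder 2
r_s = np.array([0.4, -0.1])                # systematic channel samples
to_decoder_2 = extrinsic(llr_1, prior_from_2, r_s, zeta)
print(to_decoder_2)
```

In a complete decoder the resulting values would also be permuted by the interleaver before decoder 2 uses them as priors, since decoder 2 works on the interleaved bit order.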

As an illustrative example, the original decoding performance of the turbo code proposed 
by Berrou et al.¹² is reproduced in Fig. 14.22. The results demonstrate the progressive performance 
improvement of successive iterations during iterative soft decoding. After 18 iterations, 
the bit error rate performance is only 0.7 dB away from the theoretical limit. 

14.12 LOW-DENSITY PARITY CHECK (LDPC) CODES 

Following the discovery of turbo codes, researchers carried out a flurry of activity aimed 
at finding equally powerful, if not more powerful, error correcting codes that are suitable 
for soft iterative decoding. Shortly thereafter, another class of near-capacity codes known as 
low-density parity check (LDPC) codes, originally introduced by Gallager¹⁵ in 1963, was 
rediscovered by MacKay and Neal¹⁶,¹⁷ in 1995. Since then, LDPC code designs and efficient 
means of LDPC decoding have been topics of intensive research in the coding community. 
A large number of LDPC codes have been proposed as strong competitors to turbo codes, 
often achieving better performance with comparable code lengths, code rates, and decoding 
complexity. 

Figure 14.22 The decoding performance of a rate 1/2 turbo code is shown to be very close to the theoretical limit. (Reproduced with copyright permission from IEEE from Ref. 14.) 

LDPC codes are linear block codes with sparse parity check matrices. In essence, the parity 
check matrix $H$ consists of mostly 0s and very few 1s, forming a low-density parity check 
matrix. LDPC codes are typically quite long (normally longer than 1000 bits) and noncyclic. 
Thus, an exact implementation of the BCJR MAP decoding algorithm is quite complex and 
mostly impractical. Fortunately, there are several well-established methods for decoding LDPC 
codes that can achieve near-optimum performance. 

The design of an LDPC code is equivalent to the design of a sparse parity matrix $H$. Once $H$ 
has been defined, the LDPC code is the null-space of the parity matrix $H$. The number of 1s 
in the $i$th row of $H$ is known as the row weight $\rho_i$, whereas the number of 1s in the $j$th column 
is known as the column weight $\gamma_j$. In LDPC codes, both row and column weights are much 
smaller than the code length $n$, that is, 

$$\rho_i \ll n, \qquad \gamma_j \ll n$$

For regular LDPC codes, all rows have equal weight $\rho_i = \rho$ and all columns have equal 
weight $\gamma_j = \gamma$. For irregular LDPC codes, the row weights and column weights do vary and 
typically exhibit certain weight distributions. Regular codes are easier to generate, whereas 
irregular codes with large code length may have better performance. 
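
Row and column weights are simple sums over $H$, so regularity is easy to test. The small matrix `H_toy` below is a made-up example with row weight 4 and column weight 2; it is far too short to be a practical LDPC code, and the function names are hypothetical.

```python
import numpy as np

def weights(H):
    """Return (row weights rho_i, column weights gamma_j) of a parity check matrix."""
    H = np.asarray(H)
    return H.sum(axis=1), H.sum(axis=0)

def is_regular(H):
    """A code is regular if all row weights are equal and all column weights are equal."""
    rho, gamma = weights(H)
    return len(set(rho.tolist())) == 1 and len(set(gamma.tolist())) == 1

H_toy = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1, 1, 1, 1],
                  [1, 1, 0, 0, 1, 1, 0, 0],
                  [0, 0, 1, 1, 0, 0, 1, 1]])
rho, gamma = weights(H_toy)
print(rho, gamma, is_regular(H_toy))
```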

Bipartite (Tanner) Graph 

A Tanner graph is a graphic representation that can conveniently describe a linear block code. 
This bipartite graph with incidence matrix $H$ was introduced by R. M. Tanner in 1981.¹⁸ 
Consider an $(n, k)$ linear block code. There are $n$ coded bits and $n - k$ parity bits. The Tanner 
graph of this linear block code has $n$ variable nodes corresponding to the $n$ code bits. These 
$n$ variable nodes are connected to their respective parity nodes (or check nodes) according to 
the 1s in the parity check matrix $H$. A variable node (a column) and a check node (a row) are 
connected if the corresponding element in $H$ is a 1. Because $H$ is sparse, there are only a few 
connections to each variable node or check node. These connections are known as edges. Each 
row represents the connection of a check node, and each column represents the connection of 
a variable node. For LDPC codes, if the $i$th row of $H$ has row weight of $\rho_i$, then the check 
node has $\rho_i$ edges. If column $j$ has column weight of $\gamma_j$, then the variable node has $\gamma_j$ edges. 
We use an example to illustrate the relationship between $H$ and the Tanner graph. 


Example 14.9 


Consider a Hamming (7, 4, 3) code with parity check matrix 

$$
H = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{bmatrix}
\tag{14.71}
$$


Determine its Tanner graph. 


This code has 7 variable nodes and 3 check nodes. Based on the entries in $H$, each check 
node is connected to 4 variable nodes. The first row of $H$ corresponds to the connections of 
check node 1. The nonzero entries of $H$ mark the connected variable nodes. The resulting 
Tanner graph is shown in Fig. 14.23. 

Figure 14.23 Tanner graph of the (7, 4, 3) Hamming code. 
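
The edges of this Tanner graph can be read off $H$ mechanically. The sketch below uses the matrix of Eq. (14.71); the function name `tanner_graph` and the 1-based node labels are choices made here for illustration.

```python
import numpy as np

# Parity check matrix of the (7, 4, 3) Hamming code from Eq. (14.71).
H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [0, 1, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

def tanner_graph(H):
    """Return {check: [variables]} and {variable: [checks]}, with 1-based labels."""
    checks = {i + 1: (np.flatnonzero(H[i]) + 1).tolist() for i in range(H.shape[0])}
    variables = {j + 1: (np.flatnonzero(H[:, j]) + 1).tolist() for j in range(H.shape[1])}
    return checks, variables

checks, variables = tanner_graph(H)
print(checks)      # variable nodes attached to each check node
print(variables)   # check nodes attached to each variable node
```

Each check node lists four variable nodes, matching the row weight of 4, and the total number of edges equals the number of 1s in $H$.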
Because LDPC codes are typically of length greater than 1000, their Tanner graphs are 
normally too large to illustrate in practice. However, the basic Tanner graph concept is very 
helpful to understanding LDPC codes and their iterative decoding. 

A cycle in the Tanner graph is marked by a closed loop of connected edges. The loop 
originates from and ends at the same variable (or check) node. The length of a cycle is defined 
by the number of its edges. In Example 14.9, there exist several cycles of length 4 and length 6. 
Cycles of lengths 4 and 6 are considered to be short cycles. Short cycles are known to be 
undesirable in some iterative decoding algorithms for LDPC codes. When a Tanner graph is 
free of short cycles, iterative decoding of LDPC codes based on the so-called sum-product 
algorithm can converge and generate results close to the full-scale MAP decoder that is too 
complex to implement in practice. 

To prevent a cycle of length 4, LDPC code design usually imposes an additional constraint 
on the parity matrix $H$: No two rows or columns may have more than one component 
in common. This property, known as the “row-column (RC) constraint,” is sufficient and 
necessary to avoid cycles of length 4. The presence of cycles is often unavoidable in most 
LDPC code designs based on computer searches. A significant number of researchers have 
been studying the challenging problem of either reducing the number of, or eliminating, short 
cycles of length 4, 6, and possibly 8. Interested readers should consult the book by Lin and 
Costello.² 
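
The RC constraint can be checked by counting, for each pair of columns, the rows in which both contain a 1; any count above one means the two variable nodes share two check nodes and therefore lie on a cycle of length 4. The sketch below applies this to the Hamming matrix of Eq. (14.71); the function name is hypothetical and columns are indexed from zero.

```python
from itertools import combinations
import numpy as np

def four_cycle_pairs(H):
    """Column pairs (0-based) that share more than one check, i.e., violate the RC constraint."""
    H = np.asarray(H)
    return [(a, b) for a, b in combinations(range(H.shape[1]), 2)
            if int(H[:, a] @ H[:, b]) > 1]

H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [0, 1, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])
print(four_cycle_pairs(H))   # non-empty: the Hamming example contains length-4 cycles
```

Checking column pairs is enough, because two rows with two common columns and two columns with two common rows describe the same four edges.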

We now describe two decoding methods for LDPC codes. 

Bit-Flipping LDPC Decoding 

The large code length of LDPC codes makes decoding a highly challenging problem. Two of 
the most common decoding algorithms are the hard decision bit-flipping (BF) algorithm and 
the soft-decision sum-product algorithm (SPA). 

The bit-flipping (BF) algorithm operates on a sequence of hard-decision bits $\mathbf{r} = 
011010 \cdots 010$. Parity checks on $\mathbf{r}$ generate the syndrome vector 

$$\mathbf{s} = \mathbf{r} H^T$$

Those syndrome bits of value 1 indicate parity failure. The BF algorithm tries to change a bit 
(by flipping) in $\mathbf{r}$ based on how the flip would affect the syndrome bits. 

When a code bit participates in only a single failed parity check, flipping this bit at best 
will correct 1 failed parity check but will cause $\gamma - 1$ new parity failures. For this reason, 
BF only flips bits that affect a large number of failed parity checks. A simple BF algorithm 
consists of the following steps:² 


Step 1: Calculate the parity checks $\mathbf{s} = \mathbf{r}H^T$. If all syndromes are zero, stop decoding. 

Step 2: Determine the number of failed parity checks for every bit: 

$$f_i, \qquad i = 1, 2, \ldots, n$$

Step 3: Identify the set of bits $\mathcal{F}_{\max}$ with the largest $f_i$, and flip the bits in $\mathcal{F}_{\max}$ to generate a 
new codeword $\mathbf{r}'$. 

Step 4: Let $\mathbf{r} = \mathbf{r}'$ and repeat steps 1 to 3 until the maximum number of iterations has been 
reached. 
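The four steps above can be sketched in a few lines of MATLAB. This is only an illustrative sketch under stated assumptions: the small matrix H (column weight 2), the codeword c, the error position, and the limit maxIter are all hypothetical choices, and a practical LDPC decoder would work with a much longer, sparse H.

```matlab
% Hypothetical 4 x 6 parity check matrix with column weight 2 (illustration only)
H = [1 1 0 1 0 0
     0 1 1 0 1 0
     1 0 1 0 0 1
     0 0 0 1 1 1];
c = [1 1 1 0 0 0];            % a valid codeword: mod(c*H',2) is all zero
r = c;
r(5) = 1 - r(5);              % introduce one hard-decision error in bit 5
maxIter = 10;                 % assumed iteration limit

for iter = 1:maxIter
    s = mod(r*H', 2);         % Step 1: syndrome
    if ~any(s)
        break;                % all parity checks satisfied
    end
    f = s*H;                  % Step 2: number of failed checks for every bit
    flipSet = (f == max(f));  % Step 3: bits with the largest count
    r(flipSet) = 1 - r(flipSet);
end                           % Step 4: repeat until the limit is reached
decoded = r
```

In this example bit 5 is the only bit involved in two failed checks, so a single flip restores the codeword. When several bits tie for the largest count, this simple rule flips all of them, which may not converge.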
Sum-Product Algorithm for LDPC Decoding 

The sum-product algorithm (SPA) is the most commonly used LDPC decoding method. It is 
an efficient soft-input, soft-output decoding algorithm based on iterative belief propagation. 
SPA can be better interpreted via the Tanner graph. SPA is similar to a see-saw game. In one 
step, every variable node passes information via its edges to its connected check nodes in 
the top-down pass-flow. In the next step, every check node passes back information to all the 
variable nodes it is connected to in a bottom-up pass-flow. 

To understand SPA, let the parity matrix be $H$ of size $J \times n$, where $J = n - k$ for an $(n, k)$ 
LDPC block code. Let the codeword be represented by variable node bits $\{v_j,\ j = 1, \ldots, n\}$. 

For the $j$th variable node $v_j$, let 

$$\mu_j = \{i : h_{ij} = 1,\ 1 \le i \le J\}$$

denote the set of check nodes connected to $v_j$. For the $i$th check node $z_i$, let 

$$\sigma_i = \{j : h_{ij} = 1,\ 1 \le j \le n\}$$

denote the set of variable nodes connected to $z_i$. 

First, define the probability of satisfying check node $z_i = 0$ when $v_j = u$ as 

$$R_{ij}(u) = P[z_i = 0 \mid v_j = u], \qquad u = 0, 1 \tag{14.72}$$

Let us denote the vector of variable bits as $\mathbf{v}$. We can use the Bayes theorem on conditional 
probability (Sec. 8.1) to show that 

$$\begin{aligned}
R_{ij}(u) &= \sum_{\mathbf{v}:\, v_j = u} P[z_i = 0 \mid \mathbf{v}] \cdot P[\mathbf{v} \mid v_j = u] \\
&= \sum_{v_\ell:\, \ell \in \sigma_i,\, \ell \ne j} P[z_i = 0 \mid v_j = u, \{v_\ell : \ell \in \sigma_i, \ell \ne j\}] \cdot P[\{v_\ell : \ell \in \sigma_i, \ell \ne j\} \mid v_j = u]
\end{aligned} \tag{14.73}$$


This is message passing in the bottom-up direction. 

For the check node $z_i$ to estimate the probability $P[\{v_\ell : \ell \in \sigma_i, \ell \ne j\} \mid v_j = u]$, the check 
node must collect information from the variable node set $\sigma_i$. Define the probability of $v_\ell = x$ 
obtained from its check nodes except for the $i$th one as 

$$Q_{i\ell}(x) = P[v_\ell = x \mid \{z_m = 0 : m \in \mu_\ell, m \ne i\}], \qquad x = 0, 1 \tag{14.74}$$

Furthermore, assume that the variable node probabilities are approximately independent. We 
can estimate 

$$P[\{v_\ell : \ell \in \sigma_i, \ell \ne j\} \mid v_j = u] = \prod_{\ell \in \sigma_i,\, \ell \ne j} Q_{i\ell}(v_\ell) \tag{14.75}$$

This means that the check nodes can update the message through 

$$R_{ij}(u) = \sum_{v_\ell:\, \ell \in \sigma_i,\, \ell \ne j} P[z_i = 0 \mid v_j = u, \{v_\ell : \ell \in \sigma_i, \ell \ne j\}] \cdot \prod_{\ell \in \sigma_i,\, \ell \ne j} Q_{i\ell}(v_\ell) \tag{14.76}$$

Note that the probability $P[z_i = 0 \mid v_j = u, \{v_\ell : \ell \in \sigma_i, \ell \ne j\}]$ is either 0 or 1; that is, the 
check node $z_i = 0$ either succeeds or fails. The relationship Eq. (14.76) allows $R_{ij}(u)$ to be 
updated when the $i$th check node receives $Q_{i\ell}(v_\ell)$. 
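As a small numerical illustration of the check node update in Eq. (14.76), the sketch below enumerates every assignment of the other variable bits on one check node. The incoming probabilities in q1 are hypothetical values, and the variable names are assumptions made for this example. Because the conditional probability in Eq. (14.76) is 0 or 1, each assignment contributes its product of Q values to exactly one of $R_{ij}(0)$ and $R_{ij}(1)$.

```matlab
% Hypothetical incoming messages Q_{il}(1) from the other variable nodes
% on check node i, so that Q_{il}(0) = 1 - q1
q1 = [0.9 0.2 0.6];
nOther = numel(q1);
R = zeros(1,2);                            % R(1) = R_ij(0), R(2) = R_ij(1)

for pattern = 0:2^nOther-1
    v = bitget(pattern, 1:nOther);         % one assignment of the other bits
    prob = prod(q1.^v .* (1-q1).^(1-v));   % product of Q_{il}(v_l) in Eq. (14.76)
    u = mod(sum(v), 2);                    % value of v_j for which z_i = 0 holds
    R(u+1) = R(u+1) + prob;                % the 0/1 factor selects one bin
end
R

% Cross-check: probability that the other bits have even parity
R0closed = 0.5*(1 + prod(1 - 2*q1))
```

The enumeration grows as $2^{|\sigma_i| - 1}$, so it is shown here only to make the sum in Eq. (14.76) concrete; the closed-form parity expression on the last line gives the same value of $R_{ij}(0)$ with far less work.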

Once $R_{ij}(u)$ have been updated, they can be passed to the variable nodes in the bottom-up 
direction to update $Q_{i\ell}(x)$. Again using Bayes’ theorem (Sec. 8.1), we have 

$$Q_{i\ell}(x) = \frac{P[v_\ell = x]\, P[\{z_m = 0 : m \in \mu_\ell, m \ne i\} \mid v_\ell = x]}{P[\{z_m = 0 : m \in \mu_\ell, m \ne i\}]} \tag{14.77}$$

Once again assuming that each parity check is independent, we then write 

$$P[\{z_m = 0 : m \in \mu_\ell, m \ne i\} \mid v_\ell = x] = \prod_{m \in \mu_\ell,\, m \ne i} R_{m\ell}(x) \tag{14.78}$$

Now define the prior variable bit probability as $p_\ell(x) = P(v_\ell = x)$. Let $\alpha_{i\ell}$ be the normalization 
factor such that $Q_{i\ell}(1) + Q_{i\ell}(0) = 1$. We can update $Q_{i\ell}(x)$ at the variable nodes based on 
Eqs. (14.77) and (14.78): 

$$Q_{i\ell}(x) = \alpha_{i\ell} \cdot p_\ell(x) \prod_{m \in \mu_\ell,\, m \ne i} R_{m\ell}(x) \tag{14.79}$$

This message will then be passed back in the top-down direction to the check nodes. 
Figure 14.24 illustrates the basic operation of message passing in the bottom-up and the 
top-down directions in the sum-product algorithm. The SPA can be summarized as follows. 

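Initialization: Let $m = 0$ and let $m_{\max}$ be the maximum number of iterations. For every 
$h_{i\ell} = 1$ in $H$, use prior probabilities to set 

$$Q_{i\ell}^{(0)}(1) = p_\ell(1), \quad \text{and} \quad Q_{i\ell}^{(0)}(0) = p_\ell(0)$$

Step 1: Let the check node $i$ update its information 

$$R_{ij}^{(m)}(1) = \sum_{v_\ell:\, \ell \in \sigma_i,\, \ell \ne j} P[z_i = 0 \mid v_j = 1, \{v_\ell\}] \cdot \prod_{\ell \in \sigma_i,\, \ell \ne j} Q_{i\ell}^{(m)}(v_\ell) \tag{14.80a}$$

$$R_{ij}^{(m)}(0) = \sum_{v_\ell:\, \ell \in \sigma_i,\, \ell \ne j} P[z_i = 0 \mid v_j = 0, \{v_\ell\}] \cdot \prod_{\ell \in \sigma_i,\, \ell \ne j} Q_{i\ell}^{(m)}(v_\ell) \tag{14.80b}$$

Step 2: At every variable node (indexed by $\ell$), update 

$$Q_{i\ell}^{(m+1)}(0) = \alpha_{i\ell}^{(m+1)} \cdot p_\ell(0) \prod_{m' \in \mu_\ell,\, m' \ne i} R_{m'\ell}^{(m)}(0) \tag{14.81a}$$

$$Q_{i\ell}^{(m+1)}(1) = \alpha_{i\ell}^{(m+1)} \cdot p_\ell(1) \prod_{m' \in \mu_\ell,\, m' \ne i} R_{m'\ell}^{(m)}(1) \tag{14.81b}$$

where the normalization factor $\alpha_{i\ell}^{(m+1)}$ is selected such that 

$$Q_{i\ell}^{(m+1)}(0) + Q_{i\ell}^{(m+1)}(1) = 1$$

The product index is written $m'$ here to distinguish it from the iteration counter $m$. 

Step 3: At the variable nodes, also estimate the a posteriori probabilities 

$$P^{(m+1)}[v_\ell = 0 \mid \mathbf{r}] = \alpha_\ell^{(m+1)} \cdot p_\ell(0) \prod_{m' \in \mu_\ell} R_{m'\ell}^{(m)}(0) \tag{14.82a}$$

$$P^{(m+1)}[v_\ell = 1 \mid \mathbf{r}] = \alpha_\ell^{(m+1)} \cdot p_\ell(1) \prod_{m' \in \mu_\ell} R_{m'\ell}^{(m)}(1) \tag{14.82b}$$

where the normalization factor $\alpha_\ell^{(m+1)}$ is selected such that 

$$P^{(m+1)}[v_\ell = 0 \mid \mathbf{r}] + P^{(m+1)}[v_\ell = 1 \mid \mathbf{r}] = 1$$

Step 4: Make hard decisions of each code bit 

$$\hat{v}_\ell = \operatorname{sign}\left[\log \frac{P^{(m+1)}[v_\ell = 1 \mid \mathbf{r}]}{P^{(m+1)}[v_\ell = 0 \mid \mathbf{r}]}\right]$$

If the decoded codeword satisfies all parity checks, stop decoding. Otherwise, go back 
to step 1 for another iteration. 

Figure 14.24 Message passing in the sum-product algorithm. 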

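Notice that external input signals $\{r_\ell\}$ are involved only during the estimation of a priori 
probabilities $p_\ell(1)$ and $p_\ell(0)$. SPA uses the a priori probabilities as follows: 

$$p_\ell(1) = \frac{p(r \mid v_\ell = 1)}{p(r \mid v_\ell = 1) + p(r \mid v_\ell = 0)}$$

and 

$$p_\ell(0) = \frac{p(r \mid v_\ell = 0)}{p(r \mid v_\ell = 1) + p(r \mid v_\ell = 0)}$$

For a more concrete example, consider an AWGN channel with BPSK 
modulation. For the $\ell$th bit, the received signal sample is 

$$r_\ell = \sqrt{E_b}\,(2v_\ell - 1) + w_\ell$$

where $w_\ell$ is Gaussian with zero mean and variance $\mathcal{N}/2$. Because $\{r_\ell\}$ are independent, when 
we receive $r_\ell$ we can simply use 

$$p_\ell(1) = \frac{1}{1 + \exp\left(-4\frac{\sqrt{E_b}}{\mathcal{N}} r_\ell\right)} \quad \text{and} \quad p_\ell(0) = \frac{1}{1 + \exp\left(4\frac{\sqrt{E_b}}{\mathcal{N}} r_\ell\right)}$$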
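The a priori probabilities for the AWGN case can be computed directly from the received samples. The sketch below is illustrative only: the values of Eb and N, the bit vector v, and the variable names are assumptions. It also compares the closed-form expressions with the ratio of Gaussian likelihoods from which they follow.

```matlab
Eb = 1;                                   % assumed bit energy
N  = 0.5;                                 % assumed noise level (noise variance is N/2)
v  = [1 0 1 1 0];                         % hypothetical code bits
w  = sqrt(N/2)*randn(size(v));            % zero-mean Gaussian noise samples
r  = sqrt(Eb)*(2*v - 1) + w;              % received BPSK samples

% Closed-form a priori probabilities used to initialize SPA
p1 = 1 ./ (1 + exp(-4*sqrt(Eb)*r/N));
p0 = 1 ./ (1 + exp( 4*sqrt(Eb)*r/N));

% Same quantity from the two Gaussian likelihoods (common factors cancel)
like1 = exp(-(r - sqrt(Eb)).^2 / N);
like0 = exp(-(r + sqrt(Eb)).^2 / N);
p1direct = like1 ./ (like1 + like0);
maxDiff = max(abs(p1 - p1direct))
```

Only the ratio $\sqrt{E_b}\,r_\ell/\mathcal{N}$ enters the result, so the decoder needs an estimate of the noise level in addition to the received samples.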


This completes the introduction of the sum-product algorithm for the decoding of LDPC codes. 

14.13 MATLAB EXERCISES 

In this section, we provide MATLAB programs to illustrate simple examples of block encoders 
and decoders. We focus on the simpler case of hard-decision decoding based on syndromes. 


COMPUTER EXERCISE 14.1 

In the first experiment, we provide a program to decode the (6, 3) linear block code of Example 14.1. 


% Matlab Program <Ex14_1.m> 
% to illustrate encoding and decoding of (6,3) block code 
% in Example 14.1 
% 
G=[1 0 0 1 0 1                  %Code Generator 
   0 1 0 0 1 1 
   0 0 1 1 1 0]; 
H=[1 0 1                        %Parity Check Matrix 
   0 1 1 
   1 1 0 
   1 0 0 
   0 1 0 
   0 0 1]'; 
E=[0 0 0 0 0 0                  %List of correctable errors 
   1 0 0 0 0 0 
   0 1 0 0 0 0 
   0 0 1 0 0 0 
   0 0 0 1 0 0 
   0 0 0 0 1 0 
   0 0 0 0 0 1 
   1 0 0 0 1 0]; 
K=size(E,1); 
Syndrome=mod(mtimes(E,H'),2);   %Find Syndrome List 
r=[1 1 1 0 1 1]                 %Received codeword 
display(['Syndrome ','Error Pattern']) 
display(num2str([Syndrome E])) 
x=mod(r*H',2);                  %Compute syndrome 
for kk=1:K, 
    if Syndrome(kk,:)==x, 
        idxe=kk;                %Find the syndrome index 
    end 
end 
syndrome=Syndrome(idxe,:)       %display the syndrome 
error=E(idxe,:) 
cword=xor(r,error)              %Error correction 


The execution of this MATLAB program will generate the following results, which include the 
erroneous codeword, the syndrome, the error pattern, and the corrected codeword. 

Ex14_1 

Syndrome Error Pattern 
0  0  0  0  0  0  0  0  0 
1  0  1  1  0  0  0  0  0 
0  1  1  0  1  0  0  0  0 
1  1  0  0  0  1  0  0  0 
1  0  0  0  0  0  1  0  0 
0  1  0  0  0  0  0  1  0 
0  0  1  0  0  0  0  0  1 
1  1  1  1  0  0  0  1  0 

syndrome = 
     0     1     1 

error = 
     0     1     0     0     0     0 

cword = 
     1     0     1     0     1     1 

In our next exercise, we provide a program to decode the (7, 4) Hamming code of Example 14.3. 

% Matlab Program <Ex14_3.m> 
% to illustrate encoding and decoding of Hamming (7,4) code 
% 
G=[1 0 0 0 1 0 1                % Code Generating Matrix 
   0 1 0 0 1 1 1 
   0 0 1 0 1 1 0 
   0 0 0 1 0 1 1]; 
H=[G(:,5:7)', eye(3,3)];        %Parity Check Matrix 
E=[1 0 0 0 0 0 0                %List of correctable errors 
   0 1 0 0 0 0 0 
   0 0 1 0 0 0 0 
   0 0 0 1 0 0 0 
   0 0 0 0 1 0 0 
   0 0 0 0 0 1 0 
   0 0 0 0 0 0 1]; 
K=size(E,1); 
Syndrome=mod(mtimes(E,H'),2);   %Find Syndrome List 
r=[1 0 1 0 1 1 1]               %Received codeword 
display(['Syndrome ','Error Pattern']) 
display(num2str([Syndrome E])) 
x=mod(r*H',2);                  %Compute syndrome 
for kk=1:K, 
    if Syndrome(kk,:)==x, 
        idxe=kk;                %Find the syndrome index 
    end 
end 
syndrome=Syndrome(idxe,:)       %display the syndrome 
error=E(idxe,:) 
cword=xor(r,error)              %Error correction 

Executing MATLAB program Ex14_3.m will generate for an erroneous codeword r its syndrome, 
the error pattern, and the corrected codeword: 

Ex14_3 

r = 
     1     0     1     0     1     1     1 

Syndrome Error Pattern 
1  0  1  1  0  0  0  0  0  0 
1  1  1  0  1  0  0  0  0  0 
1  1  0  0  0  1  0  0  0  0 
0  1  1  0  0  0  1  0  0  0 
1  0  0  0  0  0  0  1  0  0 
0  1  0  0  0  0  0  0  1  0 
0  0  1  0  0  0  0  0  0  1 

syndrome = 
     1     0     0 

error = 
     0     0     0     0     1     0     0 

cword = 
     1     0     1     0     0     1     1 


COMPUTER EXERCISE 14.2 

In a more realistic example, we will use the Hamming (7,4) code to encode a long binary message bit 
sequence. The coded bits will be transmitted in polar signaling over an additive white Gaussian noise 
(AWGN) channel. The channel outputs will be detected using a hard-decision function sgn. The channel 
noise will lead to hard-decision errors. The detector outputs will be decoded using the Hamming (7,4) 
decoder that is capable of correcting 1 bit error in each codeword of length 7. 

This result is compared against the uncoded polar transmission. To be fair, the average $E_b/\mathcal{N}$ ratio 
for every information bit is made equal for both cases. MATLAB program Sim74Hamming.m is given; 
the resulting BER comparison is shown in Fig. 14.25. 


% Matlab Program <Sim74Hamming.m> 
% Simulation of the Hamming (7,4) code performance 
% under polar signaling in AWGN channel and performance 
% comparison with uncoded polar signaling 
clf;clear sigcw BER_uncode BER_coded 
G=[1 0 0 0 1 0 1                        % Code Generator 
   0 1 0 0 1 1 1 
   0 0 1 0 1 1 0 
   0 0 0 1 0 1 1]; 
H=[1 1 1 0 1 0 0                        % Parity Check Matrix 
   0 1 1 1 0 1 0 
   1 1 0 1 0 0 1]; 
E=[1 0 0 0 0 0 0                        % Error patterns 
   0 1 0 0 0 0 0 
   0 0 1 0 0 0 0 
   0 0 0 1 0 0 0 
   0 0 0 0 1 0 0 
   0 0 0 0 0 1 0 
   0 0 0 0 0 0 1 
   0 0 0 0 0 0 0]; 
K2=size(E,1); 
Syndrome=mod(mtimes(E,H'),2);           % Syndrome list 
L1=25000;K=4*L1                         %Decide how many codewords 
sig_b=round(rand(1,K));                 %Generate message bits 
sig_2=reshape(sig_b,4,L1);              %4 per column for FEC 
xig_1=mod(G'*sig_2,2);                  %Encode column by column 
xig_2=2*reshape(xig_1,1,7*L1)-1;        %P/S conversion 
AWnoise1=randn(1,7*L1);                 %Generate AWGN for coded Tx 
AWnoise2=randn(1,4*L1);                 %Generate AWGN for uncoded Tx 
% Change SNR and compute BER's 
for ii=1:14 
    SNRdb=ii; 
    SNR=10^(SNRdb*0.1); 
    xig_n=sqrt(SNR*4/7)*xig_2+AWnoise1; %Add AWGN and adjust SNR 
    rig_1=(1+sign(xig_n))/2;            %Hard decisions 
    r=reshape(rig_1,7,L1)';             %S/P to form 7 bit codewords 
    x=mod(r*H',2);                      % generate error syndromes 
    for k1=1:L1, 
        for k2=1:K2, 
            if Syndrome(k2,:)==x(k1,:), 
                idxe=k2;                %find the Syndrome index 
            end 
        end 
        error=E(idxe,:);                %look up the error pattern 
        cword=xor(r(k1,:),error);       %error correction 
        sigcw(:,k1)=cword(1:4);         %keep the message bits 
    end 
    cw=reshape(sigcw,1,K); 
    BER_coded(ii)=sum(abs(cw-sig_b))/K; % Coded BER on info bits 
    % Uncoded Simulation Without Hamming code 
    xig_3=2*sig_b-1;                    % Polar signaling 
    xig_m=sqrt(SNR)*xig_3+AWnoise2;     % Add AWGN and adjust SNR 
    rig_1=(1+sign(xig_m))/2;            % Hard decision 
    BER_uncode(ii)=sum(abs(rig_1-sig_b))/K;  % Compute BER 
end 
EboverN=[1:14]-3;                       % Need to note that SNR = 2 Eb/N 

Figure 14.25 Comparison of bit error rates of uncoded polar signaling transmission and polar signaling 
transmission of Hamming (7, 4) encoded (dashed) and uncoded (solid) message bits. 


Naturally when the $E_b/\mathcal{N}$ is low, there tends to be more than 1 error bit per codeword. Thus, when 
there is more than 1 bit error, the decoder will still consider the codeword to be corrupted by only 1 bit 
error. Its attempt to correct a 1-bit error may in fact add an error bit. When the $E_b/\mathcal{N}$ is high, it is more 
likely that a codeword has at most 1 bit error. This explains why the coded BER is worse at lower $E_b/\mathcal{N}$ 
and better at higher $E_b/\mathcal{N}$. On the other hand, Fig. 14.3 gives an optimistic approximation by assuming 
a cognitive decoder that will take no action when the number of bit errors in each codeword exceeds 1. 
Its performance is marginally better at low $E_b/\mathcal{N}$ ratio. 
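The argument above can be made concrete by computing how often a 7-bit codeword contains two or more hard-decision errors. The sketch below is an illustration under stated assumptions: independent bit errors, the standard polar signaling error probability $Q(\sqrt{2E_b/\mathcal{N}})$, and each coded bit carrying 4/7 of the information bit energy. The variable names and the range of EbN_dB are arbitrary choices.

```matlab
EbN_dB = 0:2:10;                          % assumed range of Eb/N in dB
EbN    = 10.^(EbN_dB/10);
Qf     = @(x) 0.5*erfc(x/sqrt(2));        % Gaussian Q function

p = Qf(sqrt(2*(4/7)*EbN));                % raw error probability of a coded bit
Pmulti = 1 - (1-p).^7 - 7*p.*(1-p).^6;    % two or more errors in a 7-bit codeword
pUncoded = Qf(sqrt(2*EbN));               % uncoded polar signaling, for reference

disp('  Eb/N(dB)   p(coded bit)   P(>=2 errors)   p(uncoded)')
disp([EbN_dB' p' Pmulti' pUncoded'])
```

At low $E_b/\mathcal{N}$ a sizable fraction of codewords falls outside the single-error correcting capability, while at high $E_b/\mathcal{N}$ that fraction becomes negligible.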
PROBLEMS 


14.1-1 Golay’s (23, 12) codes are three-error correcting codes. Verify that $n = 23$ and $k = 12$ satisfies 
the Hamming bound exactly for $t = 3$. 

14.1-2 (a) Determine the Hamming bound for a ternary code (whose three code symbols are 0, 1, 2). 

(b) A ternary (11, 6) code exists that can correct up to two errors. Verify that this code satisfies 
the Hamming bound exactly. 

14.1-3 Confirm the possibility of an (18, 7) binary code that can correct up to three errors. Can this 
code correct up to four errors? 

14.2-1 If $G$ and $H$ are the generator and parity check matrices, respectively, then show that 

$$G H^T = 0$$

14.2-2 Given a generator matrix 

$$G = [\,1 \ \ 1 \ \ 1\,]$$

construct a (3, 1) code. How many errors can this code correct? Find the codeword for data 
vectors $d = 0$ and $d = 1$. Comment. 

14.2-3 Repeat Prob. 14.2-2 for 

$$G = [\,1 \ \ 1 \ \ 1 \ \ 1 \ \ 1\,]$$

This gives a (5, 1) code. 

14.2-4 A generator matrix 

$$G = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$

generates a (4, 2) code. 

(a) Is this a systematic code? 

(b) What is the parity check matrix of this code? 

(c) Find the codewords for all possible input bits. 

(d) Determine the minimum distance of the code and the number of bit errors this code can 
correct. 

14.2-5 Consider the following $(k + 1, k)$ systematic linear block code with the parity check digit $c_{k+1}$ 
given by 

$$c_{k+1} = d_1 + d_2 + \cdots + d_k \tag{14.83}$$

(a) Construct the appropriate generator matrix for this code. 

(b) Construct the code generated by this matrix for $k = 3$. 

(c) Determine the error detecting or correcting capabilities of this code. 

(d) Show that 

$$\mathbf{c}H^T = 0$$

and 

$$\mathbf{r}H^T = \begin{cases} 0 & \text{if no error occurs} \\ 1 & \text{if single error occurs} \end{cases}$$

14.2-6 Consider a generator matrix $G$ for a nonsystematic (6, 3) code: 

$$G = \begin{bmatrix} 0 & 1 & 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 \end{bmatrix}$$

Construct the code for this $G$, and show that $d_{\min}$, the minimum distance between codewords, 
is 3. Consequently, this code can correct at least one error. 

14.2-7 Repeat Prob. 14.2-6 if 

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix}$$

14.2-8 Find a generator matrix $G$ for a (15, 11) single-error correcting linear block code. Find the 
codeword for the data vector 10111010101. 


14.2-9 For a (6, 3) systematic linear block code, the three parity-check digits $c_4$, $c_5$, and $c_6$ are 

$$\begin{aligned} c_4 &= d_1 + d_2 + d_3 \\ c_5 &= d_1 + d_2 \\ c_6 &= d_1 + d_3 \end{aligned}$$

(a) Construct the appropriate generator matrix for this code. 

(b) Construct the code generated by this matrix. 

(c) Determine the error correcting capabilities of this code. 

(d) Prepare a suitable decoding table. 

(e) Decode the following received words: 101100, 000110, 101010. 

14.2-10 (a) Construct a code table for the (6, 3) code generated by the matrix $G$ in Prob. 14.2-6. 

(b) Prepare a suitable decoding table. 

14.2-11 Construct a single-error correcting (7, 4) linear block code (Hamming code) and the 
corresponding decoding table. 

14.2-12 For the (6, 3) code in Example 14.1, the decoding table is Table 14.3. Show that if we use this 
decoding table, and a two-error pattern 010100 or 001001 occurs, it will not be corrected. If 
it is desired to correct a single two-error pattern 010100 (along with six single-error patterns), 
construct the appropriate decoding table and verify that it does indeed correct one two-error 
pattern 010100 and that it cannot correct any other two-error patterns. 

14.2-13 (a) Given $k = 8$, find the minimum value of $n$ for a code that can correct at least one error. 

(b) Choose a generator matrix $G$ for this code. 

(c) How many double errors can this code correct? 

(d) Construct a decoding table (syndromes and corresponding correctable error patterns). 

14.2-14 Consider a (6, 2) code generated by the matrix 

$$G = \begin{bmatrix} 1 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 \end{bmatrix}$$

(a) Construct the code table for this code and determine the minimum distance between 
codewords. 

(b) Prepare a suitable decoding table. 

Hint: This code can correct all single-error patterns, seven double-error patterns, and two 
triple-error patterns. Choose the desired seven double-error patterns and the two triple-error 
patterns. 

14.3-1 (a) Use the generator polynomial $g(x) = x^3 + x + 1$ to construct a systematic (7, 4) cyclic 
code. 

(b) What are the error correcting capabilities of this code? 

(c) Construct the decoding table. 

(d) If the received word is 1101100, determine the transmitted data word. 

14.3-2 A three-error correcting (23, 12) Golay code is a cyclic code with a generator polynomial 

$$g(x) = x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1$$

Determine the codewords for the data vectors 000011110000, 101010101010, and 
11000101011110. 

14.3-3 Factorize the polynomial 

$$x^3 + x^2 + x + 1$$

Hint: A third-order polynomial must have one factor of first order. The only first-order 
polynomials that are prime (not factorizable) are $x$ and $x + 1$. Since $x$ is not a factor of the given 
polynomial, try $x + 1$. Divide $x^3 + x^2 + x + 1$ by $x + 1$. 

14.3-4 The concept explained in Prob. 14.3-3 can be extended to factorize any higher order polynomial. 
Using this technique, factorize 

$$x^5 + x^4 + x^2 + 1$$

Hint: There must be at least one first-order factor. Try dividing by the two first-order prime 
polynomials $x$ and $x + 1$. The given fifth-order polynomial can now be expressed as $\phi_1(x)\phi_4(x)$, 
where $\phi_1(x)$ is a first-order polynomial and $\phi_4(x)$ is a fourth-order polynomial that may or may 
not contain a first-order factor. Try dividing $\phi_4(x)$ by $x$ and $x + 1$. If it does not work, it must have 
two second-order polynomials both of which are prime. The possible second-order polynomials 
are $x^2$, $x^2 + 1$, $x^2 + x$, and $x^2 + x + 1$. Determine which of these are prime (not divisible by $x$ 
or $x + 1$). Now try dividing $\phi_4(x)$ by these prime polynomials of the second order. If neither 
divides, $\phi_4(x)$ must be a prime polynomial of the fourth order and the factors are $\phi_1(x)$ and 
$\phi_4(x)$. 

14.3-5 Use the concept explained in Prob. 14.3-4 to factorize a seventh-order polynomial $x^7 + 1$. 

Hint: Determine prime factors of first-, second-, and third-order polynomials. The possible 
third-order polynomials are $x^3$, $x^3 + 1$, $x^3 + x$, $x^3 + x + 1$, $x^3 + x^2$, $x^3 + x^2 + 1$, $x^3 + x^2 + x$, 
and $x^3 + x^2 + x + 1$. See hint in Prob. 14.3-4. 

14.3-6 Equation (14.16) suggests a method of constructing a generator matrix $G'$ for a cyclic code, 

$$G' = \begin{bmatrix} x^{k-1}g(x) \\ x^{k-2}g(x) \\ \vdots \\ g(x) \end{bmatrix} = \begin{bmatrix} g_1 & g_2 & \cdots & g_{n-k+1} & 0 & 0 & \cdots & 0 \\ 0 & g_1 & g_2 & \cdots & g_{n-k+1} & 0 & \cdots & 0 \\ \vdots & & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & g_1 & g_2 & \cdots & g_{n-k+1} \end{bmatrix}$$

where $g(x) = g_1 x^{n-k} + g_2 x^{n-k-1} + \cdots + g_{n-k+1}$ is the generator polynomial. This is, in 
general, a nonsystematic cyclic code. 

(a) For a single-error correcting (7, 4) cyclic code with a generator polynomial $g(x) = x^3 + 
x^2 + 1$, find $G'$ and construct the code. 

(b) Verify that this code is identical to that derived in Example 14.3 (Table 14.4). 

14.3-7 The generator matrix $G$ for a systematic cyclic code (see Prob. 14.3-6) can be obtained by 
realizing that adding any row of a generator matrix to any other row yields another valid 
generator matrix, since the codeword is formed by linear combinations of data digits. Also, a 
generator matrix for a systematic code must have an identity matrix $I_k$ in the first $k$ columns. 
Such a matrix is formed step by step as follows. Observe that each row in $G'$ in Prob. 14.3-6 is 
a left shift of the row below it, with the last row being $g(x)$. Start with the $k$th (last) row $g(x)$. 
Because $g(x)$ is of the order $n - k$, this row has the element 1 in the $k$th column, as required. 
For the $(k - 1)$th row, use the last row with one left shift. We require a 0 in the $k$th column of 
the $(k - 1)$th row to form $I_k$. If there is a 0 in the $k$th column of this $(k - 1)$th row, we accept 
it as a valid $(k - 1)$th row. If not, then we add the $k$th row to the $(k - 1)$th row to obtain 0 in 
its $k$th column. The resulting row is the final $(k - 1)$th row. This row with a single left shift 
serves as the $(k - 2)$th row. But if this newly formed $(k - 2)$th row does not have a 0 in its 
$k$th column, we add the $k$th (last) row to it to get the desired 0. We continue this way until all 
$k$ rows have been formed. This gives the generator matrix for a systematic $(n, k)$ cyclic code. 

(a) For a single-error correcting (7, 4) systematic cyclic code with a generator polynomial 
$g(x) = x^3 + x^2 + 1$, find $G$ and construct the code. 

(b) Verify that this code is identical to that in Table 14.5 (Example 14.4). 

14.3-8 (a) Use the generator polynomial $g(x) = x^3 + x + 1$ to find the generator matrix $G'$ for a 
nonsystematic (7, 4) cyclic code. 

(b) Find the code generated by this matrix $G'$. 

(c) Determine the error correcting capabilities of this code. 

14.3-9 Use the generator polynomial $g(x) = x^3 + x + 1$ (see Prob. 14.3-8) to find the generator matrix 
$G$ for a systematic (7, 4) cyclic code. 

14.3-10 Discuss the error correcting capabilities of an interleaved $(\lambda n, \lambda k)$ cyclic code with $\lambda = 10$ 
and using a three-error correcting (31, 16) BCH code. 

14.3-11 The generator polynomial 

$$g(x) = x^{10} + x^8 + x^5 + x^4 + x^2 + x + 1$$

generates a cyclic BCH (15, 5) code. 

(a) Determine the (cyclic) code generating matrix. 

(b) For encoder input data $d = 10110$, find the corresponding codeword. 

(c) Show how many errors this code can correct. 

14.4-1 Uncoded data is transmitted by using PSK over an AWGN channel with $E_b/\mathcal{N} = 9$. This data 
is now coded using a three-error correcting (23, 12) Golay code (Prob. 14.1-1) and transmitted 
over the same channel at the same data rate and with the same transmitted power. 

(a) Determine the corrected error probabilities $P_{eu}$ and $P_{ec}$ for the uncoded and the coded 
systems. 

(b) If it is decided to achieve the error probability $P_{ec}$ computed in part (a), using the uncoded 
system by increasing the transmitted power, determine the required value of $E_b/\mathcal{N}$. 

14.4-2 The simple code for detecting burst errors (Fig. 14.4) can also be used as a single-error correcting 
code with a slight modification. The $k$ data digits are divided into groups of $b$ digits in length, 
as in Fig. 14.4. To each group we add one parity check digit, so that each segment now has 
$b + 1$ digits ($b$ data digits and one parity check digit). The parity check digit is chosen to ensure 
that the total number of 1s in each segment of $b + 1$ digits is even. Now we consider these 
digits as our new data and augment them with the last segment of $b + 1$ parity check digits, as 
was done in Fig. 14.4. The data in Fig. 14.4 will be transmitted thus: 

10111 01010 11011 10001 11000 01111 

Show that this (30, 20) code is capable of single error correction as well as the detection of a 
single burst of length 5. 

14.5-1 For the convolutional encoder in Fig. 14.5, the received bits are 01 00 01 00 10 11 
11 00. Use Viterbi’s algorithm and the trellis diagram in Fig. 14.8 to decode this sequence. 

14.5-2 For the convolutional encoder shown in Fig. P14.5-2: 

(a) Draw the state and trellis diagrams and determine the output digit sequence for the data 
digits 11010100. 

(b) Use Viterbi’s algorithm to decode the following received sequences: 

(i) 100 110 111 101 001 101 001 010 

(ii) 010 110 111 101 101 101 001 010 

(iii) 111 110 111 111 001 101 001 101 

Figure P14.5-2 

14.5-3 A systematic recursive convolution encoder (Fig. P14.5-3) generates a rate 1/2 code. Unlike 
earlier examples, this encoder is recursive with feedback branches. It turns out that we can 
still use a simple trellis and state transition diagram to represent this encoder. The maximum 
likelihood Viterbi decoder also applies. Denote the state value as $(d_{k-1}, d_{k-2})$. 

(a) Illustrate the state transition diagram of this encoder. 

(b) Find the corresponding trellis diagram. 

(c) For an input data sequence of 0100110100, determine the corresponding codeword. 

Figure P14.5-3 

14.6-1 A block code has parity check matrix 

$$H = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix}$$

(a) Find the code-generating matrix of this code. 

(b) Find the minimum distance. 

(c) Find the trellis diagram. 

14.6-2 For the block code in Prob. 14.2-9, 

(a) Find the code-generating matrix. 

(b) Find the minimum distance. 

(c) Find the trellis diagram. 

A.1 Orthogonality of the Trigonometric and Exponential Signal Set 

Consider an integral $I$ defined by 

$$I = \int_{T_0} \cos n\omega_0 t \, \cos m\omega_0 t \, dt \tag{A.1a}$$

where $\int_{T_0}$ stands for integration over any contiguous interval of $T_0 = 2\pi/\omega_0$ seconds. By 
using a trigonometric identity (Appendix E), Eq. (A.1a) can be expressed as 

$$I = \frac{1}{2}\left[\int_{T_0} \cos (n+m)\omega_0 t \, dt + \int_{T_0} \cos (n-m)\omega_0 t \, dt\right] \tag{A.1b}$$

Since $\cos \omega_0 t$ executes one complete cycle during any interval of $T_0$ seconds, $\cos (n + m)\omega_0 t$ 
executes $(n+m)$ complete cycles during any interval of duration $T_0$. Therefore, the first integral 
in Eq. (A.1b), which represents the area under $(n + m)$ complete cycles of a sinusoid, equals 
zero. The same argument shows that the second integral in Eq. (A.1b) is also zero, except when 
$n = m$. Hence, $I$ in Eq. (A.1b) is zero for all $n \ne m$. When $n = m$, the first integral in Eq. 
(A.1b) is still zero, but the second integral yields 

$$I = \frac{1}{2}\int_{T_0} dt = \frac{T_0}{2}$$

Thus, 

$$\int_{T_0} \cos n\omega_0 t \, \cos m\omega_0 t \, dt = \begin{cases} 0 & n \ne m \\ \dfrac{T_0}{2} & m = n \ne 0 \end{cases} \tag{A.2a}$$

We can use similar arguments to show that 

$$\int_{T_0} \sin n\omega_0 t \, \sin m\omega_0 t \, dt = \begin{cases} 0 & n \ne m \\ \dfrac{T_0}{2} & n = m \ne 0 \end{cases} \tag{A.2b}$$

and 

$$\int_{T_0} \sin n\omega_0 t \, \cos m\omega_0 t \, dt = 0 \qquad \text{all } n \text{ and } m \tag{A.2c}$$

A.2 Orthogonality of the Exponential Signal Set 

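The set of exponentials $e^{jn\omega_0 t}$ $(n = 0, \pm 1, \pm 2, \ldots)$ is orthogonal over any interval of duration 
$T_0$, that is, 

$$\int_{T_0} e^{jm\omega_0 t}\left(e^{jn\omega_0 t}\right)^* dt = \int_{T_0} e^{j(m-n)\omega_0 t}\, dt = \begin{cases} 0 & m \ne n \\ T_0 & m = n \end{cases} \tag{A.3}$$

Let the integral on the left-hand side of Eq. (A.3) be $I$, where 

$$I = \int_{T_0} e^{jm\omega_0 t}\left(e^{jn\omega_0 t}\right)^* dt = \int_{T_0} e^{j(m-n)\omega_0 t}\, dt \tag{A.4}$$

The case of $m = n$ is trivial: the integrand is unity, and $I = T_0$. When $m \ne n$, however, 

$$I = \frac{1}{j(m-n)\omega_0}\, e^{j(m-n)\omega_0 t}\Big|_{t_1}^{t_1+T_0} = \frac{1}{j(m-n)\omega_0}\, e^{j(m-n)\omega_0 t_1}\left[e^{j(m-n)\omega_0 T_0} - 1\right] = 0$$

The last result follows from the fact that $\omega_0 T_0 = 2\pi$, and $e^{j2\pi k} = 1$ for all integral values of $k$. 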
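The orthogonality relations of Eqs. (A.2) and (A.3) are easy to confirm numerically. The following sketch is only a spot check, not a proof: the period T0, the starting instant t1, the harmonic indices n and m, and the grid size are arbitrary assumptions, and the integrals are approximated with the trapezoidal rule.

```matlab
T0 = 2;                                   % assumed period (seconds)
w0 = 2*pi/T0;                             % fundamental frequency
t1 = 0.3;                                 % arbitrary start of the interval
t  = linspace(t1, t1 + T0, 20001);        % one full period
n  = 3;  m = 5;                           % arbitrary harmonic indices, n ~= m

I_cc_nm = trapz(t, cos(n*w0*t).*cos(m*w0*t));  % Eq. (A.2a), n ~= m: about 0
I_cc_nn = trapz(t, cos(n*w0*t).^2);            % Eq. (A.2a), n = m: about T0/2
I_ss_nn = trapz(t, sin(n*w0*t).^2);            % Eq. (A.2b), n = m: about T0/2
I_sc    = trapz(t, sin(n*w0*t).*cos(n*w0*t));  % Eq. (A.2c): about 0

E_nm = trapz(t, exp(1j*m*w0*t).*conj(exp(1j*n*w0*t)));  % Eq. (A.3), m ~= n: about 0
E_nn = trapz(t, exp(1j*n*w0*t).*conj(exp(1j*n*w0*t)));  % Eq. (A.3), m = n: about T0

disp([I_cc_nm I_cc_nn I_ss_nn I_sc abs(E_nm) real(E_nn)])
```

Changing t1 leaves the results unchanged, which reflects the statement that the integration interval may be any contiguous interval of duration $T_0$.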
CAUCHY-SCHWARZ INEQUALITY 

Prove the following Cauchy-Schwarz inequality for a pair of real finite energy signals $f(t)$ 
and $g(t)$: 

$$\left[\int_a^b f(t)g(t)\,dt\right]^2 \le \left[\int_a^b f^2(t)\,dt\right]\left[\int_a^b g^2(t)\,dt\right] \tag{B.1}$$

with equality only if $g(t) = cf(t)$, where $c$ is an arbitrary constant. 

The Cauchy-Schwarz inequality for finite energy, complex-valued functions $X(\omega)$ and 
$Y(\omega)$ is given by 

$$\left|\int_{-\infty}^{\infty} X(\omega)Y(\omega)\,d\omega\right|^2 \le \int_{-\infty}^{\infty} |X(\omega)|^2\,d\omega \int_{-\infty}^{\infty} |Y(\omega)|^2\,d\omega \tag{B.2}$$

with equality only if $Y(\omega) = cX^*(\omega)$, where $c$ is an arbitrary constant. 

We can prove Eq. (B.1) as follows: for any real value of $\lambda$, we know that 

$$\int_a^b [\lambda f(t) - g(t)]^2\,dt \ge 0 \tag{B.3}$$

or 

$$\lambda^2 \int_a^b f^2(t)\,dt - 2\lambda \int_a^b f(t)g(t)\,dt + \int_a^b g^2(t)\,dt \ge 0 \tag{B.4}$$

Because this quadratic expression in $\lambda$ is nonnegative for any value of $\lambda$, its discriminant must be 
nonpositive, and Eq. (B.1) follows. If the discriminant is zero, then for some value of $\lambda = c$, 
the quadratic equals zero. This is possible only if $cf(t) - g(t) = 0$, and the result follows. 
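A quick numerical check of Eq. (B.1) may help fix ideas. The sketch below is only an illustration: the signals f and g, the interval [0, 1], and the constant 3 used for the equality case are arbitrary assumptions, and the integrals are approximated by the trapezoidal rule.

```matlab
t = linspace(0, 1, 10001);                % assumed interval [a, b] = [0, 1]
f = exp(-t);                              % hypothetical real signals
g = sin(2*pi*t) + t;

lhs = trapz(t, f.*g)^2;                   % left-hand side of Eq. (B.1)
rhs = trapz(t, f.^2) * trapz(t, g.^2);    % right-hand side of Eq. (B.1)
gap = rhs - lhs                           % positive: strict inequality here

% Equality case g(t) = c f(t), with c = 3
g2   = 3*f;
lhs2 = trapz(t, f.*g2)^2;
rhs2 = trapz(t, f.^2) * trapz(t, g2.^2);
relDiff = abs(lhs2 - rhs2)/rhs2           % zero up to rounding error
```

The gap closes only when one signal is a scaled copy of the other, which is the equality condition stated with Eq. (B.1).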

To prove Eq. (B.2), we observe that $|X(\omega)|$ and $|Y(\omega)|$ are real functions and inequality 
Eq. (B.1) applies. Hence, 

$$\left[\int_a^b |X(\omega)Y(\omega)|\,d\omega\right]^2 \le \int_a^b |X(\omega)|^2\,d\omega \int_a^b |Y(\omega)|^2\,d\omega \tag{B.5}$$

with equality only if $|Y(\omega)| = c|X(\omega)|$, where $c$ is an arbitrary constant. Now recall that 

$$\left|\int X(\omega)Y(\omega)\,d\omega\right| \le \int |X(\omega)||Y(\omega)|\,d\omega = \int |X(\omega)Y(\omega)|\,d\omega \tag{B.6}$$

with equality if and only if $Y(\omega) = cX^*(\omega)$, where $c$ is an arbitrary constant. Equation (B.2) 
immediately follows from Eqs. (B.5) and (B.6). 
GRAM-SCHMIDT ORTHOGONALIZATION 
OF A VECTOR SET 

We have defined the dimensionality of a vector space as the maximum number of independent 
vectors in the space. Thus in an $N$-dimensional space, there can be no more than $N$ vectors 
that are independent. Alternatively, it is always possible to find a set of $N$ vectors that are 
independent. Once such a set has been chosen, any vector in this space can be expressed in 
terms of (as a linear combination of) the vectors in this set. This set forms what we commonly 
refer to as a basis set, which forms the coordinate system. This set of $N$ independent vectors 
is by no means unique. The reader is familiar with this property in the physical space of 
three dimensions, where one can find an infinite number of independent sets of three vectors. 
This is clear from the fact that we have an infinite number of possible coordinate systems. 
An orthogonal set, however, is of special interest because it is easier to deal with than a 
nonorthogonal set. If we are given a set of $N$ independent vectors, it is possible to obtain 
from this set another set of $N$ independent vectors that is orthogonal. This is done by the 
Gram-Schmidt process of orthogonalization. 

In the following derivation, we use the result [derived in Eq. (2.27)] that the projection 
(or component) of a vector $\mathbf{x}_2$ upon another vector $\mathbf{x}_1$ (see Fig. C.1) is $c_{12}\mathbf{x}_1$, where 

$$c_{12} = \frac{\langle \mathbf{x}_1, \mathbf{x}_2 \rangle}{\|\mathbf{x}_1\|^2} \tag{C.1}$$

The error in this approximation is the vector $\mathbf{x}_2 - c_{12}\mathbf{x}_1$, that is, 

$$\text{error vector} = \mathbf{x}_2 - \frac{\langle \mathbf{x}_1, \mathbf{x}_2 \rangle}{\|\mathbf{x}_1\|^2}\,\mathbf{x}_1 \tag{C.2}$$

The error vector, shown dashed in Fig. C.1, is orthogonal to vector $\mathbf{x}_1$. 

To get physical insight into this procedure, we shall consider a simple case of two-dimensional 
space. Let $\mathbf{x}_1$ and $\mathbf{x}_2$ be two independent vectors in a two-dimensional space 
(Fig. C.1). We wish to generate a new set of two orthogonal vectors $\mathbf{y}_1$ and $\mathbf{y}_2$ from $\mathbf{x}_1$ and $\mathbf{x}_2$. 
For convenience, we shall choose 

$$\mathbf{y}_1 = \mathbf{x}_1 \tag{C.3}$$

We now find another vector $\mathbf{y}_2$ that is orthogonal to $\mathbf{y}_1$ (and $\mathbf{x}_1$). Figure C.1 shows that the error 
vector in approximation of $\mathbf{x}_2$ by $\mathbf{y}_1$ (dashed lines) is orthogonal to $\mathbf{y}_1$, and can be taken as $\mathbf{y}_2$. 
Hence, 

$$\mathbf{y}_2 = \mathbf{x}_2 - \frac{\langle \mathbf{x}_1, \mathbf{x}_2 \rangle}{\|\mathbf{x}_1\|^2}\,\mathbf{x}_1 = \mathbf{x}_2 - \frac{\langle \mathbf{y}_1, \mathbf{x}_2 \rangle}{\|\mathbf{y}_1\|^2}\,\mathbf{y}_1 \tag{C.4}$$

Equations (C.3) and (C.4) yield the desired orthogonal set. Note that this set is not unique. 
There is an infinite number of possible orthogonal vector sets $(\mathbf{y}_1, \mathbf{y}_2)$ that can be generated 
from $(\mathbf{x}_1, \mathbf{x}_2)$. In our derivation, we could as well have started with $\mathbf{y}_1 = \mathbf{x}_2$ instead of $\mathbf{y}_1 = \mathbf{x}_1$. 
This starting point would have yielded an entirely different set. 

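The reader can extend these results to a three-dimensional case. If vectors $\mathbf{x}_1$,
$\mathbf{x}_2$, $\mathbf{x}_3$ form an independent set in this space, then we form vectors
$\mathbf{y}_1$ and $\mathbf{y}_2$ as in Eqs. (C.3) and (C.4). To determine $\mathbf{y}_3$, we
approximate $\mathbf{x}_3$ in terms of vectors $\mathbf{y}_1$ and $\mathbf{y}_2$. The error in this
approximation must be orthogonal to both $\mathbf{y}_1$ and $\mathbf{y}_2$ and, hence, can be taken
as the third orthogonal vector $\mathbf{y}_3$. Hence,

$$\mathbf{y}_3 = \mathbf{x}_3 - \text{sum of projections of } \mathbf{x}_3 \text{ on } \mathbf{y}_1 \text{ and } \mathbf{y}_2$$

$$= \mathbf{x}_3 - \frac{\langle \mathbf{y}_1, \mathbf{x}_3 \rangle}{\|\mathbf{y}_1\|^2}\,\mathbf{y}_1
- \frac{\langle \mathbf{y}_2, \mathbf{x}_3 \rangle}{\|\mathbf{y}_2\|^2}\,\mathbf{y}_2 \tag{C.5}$$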


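These results can be extended to an N-dimensional space. In general, given N independent
vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$, if we proceed along similar lines, we can
obtain an orthogonal set $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_N$, where

$$\mathbf{y}_1 = \mathbf{x}_1 \tag{C.6}$$

and

$$\mathbf{y}_j = \mathbf{x}_j - \sum_{k=1}^{j-1} \frac{\langle \mathbf{y}_k, \mathbf{x}_j \rangle}{\|\mathbf{y}_k\|^2}\,\mathbf{y}_k
\qquad j = 2, 3, \ldots, N \tag{C.7}$$

Note that this is one of the infinitely many orthogonal sets that can be formed from the set
$\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$. Moreover, this set is not an orthonormal set. The
orthonormal set $\hat{\mathbf{y}}_1, \hat{\mathbf{y}}_2, \ldots, \hat{\mathbf{y}}_N$ can be obtained by
normalizing the lengths of the respective vectors,

$$\hat{\mathbf{y}}_k = \frac{\mathbf{y}_k}{\|\mathbf{y}_k\|}$$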
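The recursion of Eqs. (C.6) and (C.7), followed by normalization, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`inner`, `gram_schmidt`, `normalize`) and the three test vectors are assumptions made here, not part of the text, and the vectors are taken to be real.

```python
def inner(u, v):
    """Real inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(xs):
    """Build orthogonal y_1..y_N from independent x_1..x_N, following Eqs. (C.6)-(C.7)."""
    ys = []
    for x in xs:
        y = list(x)
        for yk in ys:
            c = inner(yk, x) / inner(yk, yk)  # <y_k, x_j> / ||y_k||^2
            y = [yi - c * yki for yi, yki in zip(y, yk)]  # subtract projection on y_k
        ys.append(y)
    return ys


def normalize(ys):
    """Scale each y_k to unit length to obtain an orthonormal set."""
    return [[a / inner(y, y) ** 0.5 for a in y] for y in ys]


# Hypothetical independent set in three dimensions (chosen only for illustration)
xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ys = gram_schmidt(xs)
for i in range(len(ys)):
    for j in range(i + 1, len(ys)):
        # every pairwise inner product should vanish up to rounding error
        print(i + 1, j + 1, abs(inner(ys[i], ys[j])) < 1e-12)
```

Reordering the input vectors generally produces a different orthogonal set, which is the non-uniqueness noted above.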

We can apply these concepts to signal space because one-to-one correspondence exists between
signals and vectors. If we have N independent signals $x_1(t), x_2(t), \ldots, x_N(t)$, we can form a
set of N orthogonal signals $y_1(t), y_2(t), \ldots, y_N(t)$ as

$$y_j(t) = x_j(t) - \sum_{k=1}^{j-1} c_{kj}\, y_k(t) \qquad j = 2, 3, \ldots, N \tag{C.8}$$

where

$$c_{kj} = \frac{\int y_k(t)\, x_j(t)\, dt}{\int y_k^2(t)\, dt} \tag{C.9}$$


Note that this is one of the infinitely many possible orthogonal sets that can be formed from
the set $x_1(t), x_2(t), \ldots, x_N(t)$. The set can be normalized by dividing each signal $y_j(t)$ by
the square root of its energy.


Example C.1 The exponential signals

$$g_1(t) = e^{-pt}u(t)$$
$$g_2(t) = e^{-2pt}u(t)$$
$$\vdots$$
$$g_N(t) = e^{-Npt}u(t)$$

form an independent set of signals in N-dimensional space, where N may be any integer. This
set, however, is not orthogonal. We can use the Gram-Schmidt process to obtain an orthogonal
set for this space. If $y_1(t), y_2(t), \ldots, y_N(t)$ is the desired orthogonal basis set, we choose

$$y_1(t) = g_1(t) = e^{-pt}u(t)$$

From Eqs. (C.8) and (C.9), with $x_2(t) = g_2(t)$, we have

$$y_2(t) = x_2(t) - c_{12}\, y_1(t)$$

where

$$c_{12} = \frac{\int_{-\infty}^{\infty} y_1(t)\, x_2(t)\, dt}{\int_{-\infty}^{\infty} y_1^2(t)\, dt}
= \frac{\int_0^{\infty} e^{-pt} e^{-2pt}\, dt}{\int_0^{\infty} e^{-2pt}\, dt}
= \frac{2}{3}$$

Hence,

$$y_2(t) = \left(e^{-2pt} - \tfrac{2}{3}e^{-pt}\right)u(t) \tag{C.10}$$

Similarly, we can proceed to find the remaining functions $y_3(t), \ldots, y_N(t)$, and so on. The
reader can verify that all this represents a mutually orthogonal set.
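One way to carry out the verification suggested in Example C.1 is with exact rational arithmetic. The sketch below assumes $p = 1$ (the coefficients do not depend on $p$) and stores each signal as a dictionary mapping $k$ to the coefficient of $e^{-kpt}u(t)$. It relies on the elementary integral $\int_0^\infty e^{-mpt}e^{-npt}\,dt = 1/[(m+n)p]$. The representation and the function names are assumptions made here for illustration.

```python
from fractions import Fraction


def inner(f, g, p=Fraction(1)):
    """<f, g> for signals stored as {k: coeff}, meaning sum of coeff * exp(-k p t) u(t)."""
    return sum(a * b / ((m + n) * p) for m, a in f.items() for n, b in g.items())


def gram_schmidt_exponentials(N):
    """Apply Eqs. (C.8)-(C.9) to g_k(t) = exp(-k p t) u(t), k = 1..N."""
    ys = []
    for j in range(1, N + 1):
        xj = {j: Fraction(1)}
        y = dict(xj)
        for yk in ys:
            c = inner(yk, xj) / inner(yk, yk)  # c_kj of Eq. (C.9)
            for m, a in yk.items():
                y[m] = y.get(m, Fraction(0)) - c * a  # subtract c_kj * y_k(t)
        ys.append(y)
    return ys


ys = gram_schmidt_exponentials(3)
print(ys[1])  # coefficients of exp(-2pt) and exp(-pt) in y_2(t), compare Eq. (C.10)
print(inner(ys[0], ys[1]), inner(ys[0], ys[2]), inner(ys[1], ys[2]))  # pairwise inner products
```

Because the arithmetic is exact, the pairwise inner products come out as exact zeros rather than small rounding residues.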
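BASIC MATRIX PROPERTIES AND OPERATIONS

D.1 Notation

An $n \times 1$ column vector $\mathbf{x}$ consists of n entries and is formed by

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{D.1a}$$

The transpose of $\mathbf{x}$ is a row vector represented by

$$\mathbf{x}^T = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \tag{D.1b}$$

The conjugate transpose of $\mathbf{x}$ is also a row vector written as

$$\mathbf{x}^H = (\mathbf{x}^*)^T = \begin{bmatrix} x_1^* & x_2^* & \cdots & x_n^* \end{bmatrix} \tag{D.1c}$$

$\mathbf{x}^H$ is also known as the Hermitian of $\mathbf{x}$.

An $m \times n$ matrix consists of n column vectors

$$A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} \tag{D.2a}$$

$$= \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix} \tag{D.2b}$$

We also define its transpose and Hermitian, respectively, as

$$A^T = \begin{bmatrix}
a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1,n} & a_{2,n} & \cdots & a_{m,n}
\end{bmatrix}
\qquad
A^H = \begin{bmatrix}
a_{1,1}^* & a_{2,1}^* & \cdots & a_{m,1}^* \\
a_{1,2}^* & a_{2,2}^* & \cdots & a_{m,2}^* \\
\vdots & \vdots & \ddots & \vdots \\
a_{1,n}^* & a_{2,n}^* & \cdots & a_{m,n}^*
\end{bmatrix} \tag{D.2c}$$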
* If $A^T = A$, then we say that A is a symmetric matrix.

* If $A^H = A$, then we say that A is a Hermitian matrix.

* If A consists of only real entries, then $A^H = A^T$, so A is Hermitian if and only if it is symmetric.


D.2 Matrix Product and Properties 

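For an $m \times n$ matrix A and an $n \times \ell$ matrix B with

$$B = \begin{bmatrix}
b_{1,1} & b_{1,2} & \cdots & b_{1,\ell} \\
b_{2,1} & b_{2,2} & \cdots & b_{2,\ell} \\
\vdots & \vdots & \ddots & \vdots \\
b_{n,1} & b_{n,2} & \cdots & b_{n,\ell}
\end{bmatrix} \tag{D.3}$$

the matrix product $C = A \cdot B$ has dimension $m \times \ell$ and equals

$$C = \begin{bmatrix}
c_{1,1} & c_{1,2} & \cdots & c_{1,\ell} \\
c_{2,1} & c_{2,2} & \cdots & c_{2,\ell} \\
\vdots & \vdots & \ddots & \vdots \\
c_{m,1} & c_{m,2} & \cdots & c_{m,\ell}
\end{bmatrix}
\qquad \text{where} \qquad
c_{i,j} = \sum_{k=1}^{n} a_{i,k}\, b_{k,j} \tag{D.4}$$

In general $AB \neq BA$. In fact, the products may not even be well defined. To be able to multiply
A and B, the number of columns of A must equal the number of rows of B.

In particular, the product of a row vector and a column vector is

$$\mathbf{y}^H \mathbf{x} = \sum_{k=1}^{n} y_k^* x_k \tag{D.5a}$$

$$= \langle \mathbf{x}, \mathbf{y} \rangle \tag{D.5b}$$

Therefore, $\mathbf{x}^H \mathbf{x} = \|\mathbf{x}\|^2$.

Two vectors $\mathbf{x}$ and $\mathbf{y}$ are orthogonal if $\mathbf{y}^H \mathbf{x} = \mathbf{x}^H \mathbf{y} = 0$.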

There are several commonly used properties of matrix products:

$$A(B + C) = AB + AC \tag{D.6a}$$
$$A(BC) = (AB)C \tag{D.6b}$$
$$(AB)^* = A^* B^* \tag{D.6c}$$
$$(AB)^T = B^T A^T \tag{D.6d}$$
$$(AB)^H = B^H A^H \tag{D.6e}$$
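Properties (D.6c) through (D.6e) are easy to spot-check numerically. The following pure-Python sketch stores matrices as lists of rows; the helper names and the two small complex matrices are assumptions chosen here for illustration, not taken from the text.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x l matrix B, entry by entry as in Eq. (D.4)."""
    assert len(A[0]) == len(B), "columns of A must equal rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]


def conj(A):
    """Entrywise complex conjugate A^*."""
    return [[complex(a).conjugate() for a in row] for row in A]


def hermitian(A):
    """Conjugate transpose A^H = (A^*)^T."""
    return transpose(conj(A))


A = [[1 + 2j, 0, 3], [2, -1j, 1]]      # 2 x 3, hypothetical values
B = [[1, 1j], [0, 2], [1 - 1j, -1]]    # 3 x 2, hypothetical values
AB = matmul(A, B)
print(conj(AB) == matmul(conj(A), conj(B)))                   # property (D.6c)
print(transpose(AB) == matmul(transpose(B), transpose(A)))    # property (D.6d)
print(hermitian(AB) == matmul(hermitian(B), hermitian(A)))    # property (D.6e)
print(len(AB), len(matmul(B, A)))  # AB is 2 x 2 while BA is 3 x 3, so AB and BA differ
```

Exact equality is safe here only because every entry has small integer real and imaginary parts; with general floating-point data a tolerance would be needed.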
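D.3 Identity and Diagonal Matrices

An $n \times n$ square matrix is diagonal if all its off-diagonal entries are zero, that is,

$$D = \mathrm{diag}(d_1, d_2, \ldots, d_n) \tag{D.7a}$$

$$= \begin{bmatrix}
d_1 & 0 & 0 & \cdots & 0 \\
0 & d_2 & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & d_{n-1} & 0 \\
0 & 0 & \cdots & 0 & d_n
\end{bmatrix} \tag{D.7b}$$

An identity matrix $I_n$ has unit diagonal entries

$$I_n = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix} \tag{D.8}$$

For an $n \times n$ square matrix A, if there exists an $n \times n$ square matrix B such that

$$BA = AB = I_n$$

then

$$B = A^{-1} \tag{D.9}$$

is the inverse matrix of A. For example, given a diagonal matrix

$$D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$$

$$D^{-1} = \mathrm{diag}\left(\frac{1}{d_1}, \frac{1}{d_2}, \ldots, \frac{1}{d_n}\right)$$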

D.4 Determinant of Square Matrices 

The determinant of an $n \times n$ square matrix A is defined recursively by

$$\det(A) = \sum_{i=1}^{n} a_{i,j}\, (-1)^{i+j} \det(M_{i,j}) \tag{D.10}$$

where $M_{i,j}$ is an $(n-1) \times (n-1)$ matrix known as the minor of A by eliminating its ith row
and its jth column. Specifically, for a $2 \times 2$ matrix,

$$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$$
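The recursive definition translates directly into code. The sketch below expands along a chosen column $j$, using 0-based indices (the sign $(-1)^{i+j}$ is unchanged by that shift). The function names and the sample matrix are assumptions made here. Cofactor expansion costs on the order of $n!$ operations, so this is suitable only for small matrices.

```python
def minor(A, i, j):
    """Matrix left after deleting row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A, j=0):
    """Determinant by cofactor expansion along column j, in the manner of Eq. (D.10)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum(A[i][j] * (-1) ** (i + j) * det(minor(A, i, j)) for i in range(n))


A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # hypothetical 3 x 3 matrix
print(det(A))          # expansion along the first column
print(det(A, j=2))     # expansion along the last column gives the same value
print(det([[1, 2], [3, 4]]))  # 2 x 2 case reduces to ad - bc
```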
Based on the definition of determinant, for a scalar $\alpha$,

$$\det(\alpha A) = \alpha^n \det(A) \tag{D.11a}$$

$$\det(A^T) = \det(A) \tag{D.11b}$$

For an identity matrix

$$\det(I) = 1 \tag{D.11c}$$

Also, for two square matrices A and B,

$$\det(AB) = \det(A)\det(B) \tag{D.11d}$$

Therefore,

$$\det\left(AA^{-1}\right) = \det(A)\det\left(A^{-1}\right) = 1 \tag{D.11e}$$

For an $m \times n$ matrix A and an $n \times m$ matrix B, we have

$$\det(I_m + AB) = \det(I_n + BA) \tag{D.12}$$


D.5 Trace

The trace of square matrix A is the sum of its diagonal entries

$$\mathrm{Tr}(A) = \sum_{i=1}^{n} a_{i,i} \tag{D.13}$$

For an $m \times n$ matrix A and an $n \times m$ matrix B, we have

$$\mathrm{Tr}(AB) = \mathrm{Tr}(BA) \tag{D.14}$$
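Both the trace identity (D.14) and the determinant identity (D.12) involve a product AB of size $m \times m$ and a product BA of size $n \times n$, so a small numerical check with $m \neq n$ is instructive. The sketch below uses integer matrices so that the comparison is exact. All helper names and the matrices are assumptions introduced here.

```python
def matmul(A, B):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def det(A):
    """Determinant by cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[i][0] * (-1) ** i * det([row[1:] for r, row in enumerate(A) if r != i])
               for i in range(len(A)))


def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def identity_plus(M):
    """Return I + M for a square matrix M."""
    return [[M[i][j] + (1 if i == j else 0) for j in range(len(M))] for i in range(len(M))]


A = [[1, 2, 0], [0, 1, -1]]      # 2 x 3, hypothetical values
B = [[2, 1], [0, 3], [1, -1]]    # 3 x 2, hypothetical values
AB, BA = matmul(A, B), matmul(B, A)
print(trace(AB) == trace(BA))                            # Eq. (D.14)
print(det(identity_plus(AB)) == det(identity_plus(BA)))  # Eq. (D.12)
```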


D.6 Eigendecomposition 

If the $n \times n$ square matrix A is Hermitian, then the equation

$$A\mathbf{u} = \lambda \mathbf{u} \tag{D.15}$$

specifies an eigenvalue $\lambda$ and the associated eigenvector $\mathbf{u}$.

When A is Hermitian, its eigenvalues are real-valued. Furthermore, A can be decomposed
into

$$A = U \Lambda U^H \tag{D.16}$$

in which the matrix

$$U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_n \end{bmatrix} \tag{D.17}$$

consists of orthogonal eigenvectors such that

$$U U^H = I_n \tag{D.18}$$

Matrices satisfying this property are called unitary.
Furthermore, the diagonal matrix

$$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \tag{D.19}$$

consists of the corresponding eigenvalues of A.

Because

$$U^H U = U U^H = I_n \tag{D.20}$$

we can also write

$$U^H A U = \Lambda \tag{D.21}$$

The eigenvalues of A are very useful characteristics. In particular,

$$\det(A) = \prod_{i=1}^{n} \lambda_i \tag{D.22a}$$

$$\mathrm{Trace}(A) = \sum_{i=1}^{n} \lambda_i \tag{D.22b}$$

D.7 Special Hermitian Square Matrices

Let an $n \times n$ matrix A be Hermitian. A is positive definite if for any $n \times 1$ vector
$\mathbf{x} \neq \mathbf{0}$, we have

$$\mathbf{x}^H A \mathbf{x} > 0 \tag{D.23}$$

A is semipositive definite if for any $n \times 1$ vector $\mathbf{x}$, we have

$$\mathbf{x}^H A \mathbf{x} \geq 0 \tag{D.24}$$

A is negative definite if for any $n \times 1$ vector $\mathbf{x} \neq \mathbf{0}$, we have

$$\mathbf{x}^H A \mathbf{x} < 0 \tag{D.25}$$



A is positive definite if and only if all its eigenvalues are positive. 
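For a $2 \times 2$ Hermitian matrix the eigenvalues have a closed form, which makes it easy to illustrate that they are real, that their product and sum equal the determinant and the trace, and that positive eigenvalues go with a positive quadratic form. The sketch below is restricted to the $2 \times 2$ case; the function names, the matrix entries, and the test vector are assumptions made here for illustration.

```python
import math


def hermitian_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [conj(b), d]] with real a and d (closed form, 2 x 2 only)."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius


def quadratic_form(a, b, d, x):
    """x^H A x for A = [[a, b], [conj(b), d]] and x = (x1, x2)."""
    x1, x2 = complex(x[0]), complex(x[1])
    ax1 = a * x1 + b * x2                 # first entry of A x
    ax2 = b.conjugate() * x1 + d * x2     # second entry of A x
    return (x1.conjugate() * ax1 + x2.conjugate() * ax2).real


a, b, d = 2.0, 1 + 1j, 3.0                # hypothetical Hermitian matrix entries
lam1, lam2 = hermitian_2x2_eigenvalues(a, b, d)
print(lam1, lam2)                         # both eigenvalues are real
print(lam1 * lam2, a * d - abs(b) ** 2)   # product of eigenvalues versus det(A)
print(lam1 + lam2, a + d)                 # sum of eigenvalues versus Trace(A)
print(quadratic_form(a, b, d, (1 - 2j, 0.5j)) > 0)  # positive for this nonzero x
```

A single test vector does not prove positive definiteness; the eigenvalue test is what settles it for all nonzero $\mathbf{x}$.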
MISCELLANEOUS

E.1 L'Hôpital's Rule

If $\lim f(x)/g(x)$ results in the indeterminate form $0/0$ or $\infty/\infty$, then

$$\lim \frac{f(x)}{g(x)} = \lim \frac{f'(x)}{g'(x)}$$


E.2 Taylor and Maclaurin Series

$$f(x) = f(a) + \frac{(x-a)}{1!} f'(a) + \frac{(x-a)^2}{2!} f''(a) + \cdots$$

$$f(x) = f(0) + \frac{x}{1!} f'(0) + \frac{x^2}{2!} f''(0) + \cdots$$


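E.3 Power Series

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots$$

$$\tan x = x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \cdots \qquad x^2 < \frac{\pi^2}{4}$$

$$Q(x) = \frac{e^{-x^2/2}}{x\sqrt{2\pi}} \left(1 - \frac{1}{x^2} + \frac{1 \cdot 3}{x^4} - \frac{1 \cdot 3 \cdot 5}{x^6} + \cdots \right)$$

$$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots + \binom{n}{k}x^k + \cdots + x^n \tag{E.1}$$

$$\approx 1 + nx \qquad |x| \ll 1$$

$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots \qquad |x| < 1$$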
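A short numerical experiment shows how quickly truncated power series converge for small arguments. The sketch below evaluates partial sums of the Maclaurin series for $e^x$ and $\sin x$ and compares the first-order binomial approximation $(1+x)^n \approx 1 + nx$ with the exact value. The function names, the evaluation point $x = 0.5$, and the numbers of terms are assumptions chosen here for illustration.

```python
import math


def exp_series(x, terms):
    """Partial sum of e^x = sum over n >= 0 of x^n / n!, using the first `terms` terms."""
    return sum(x ** n / math.factorial(n) for n in range(terms))


def sin_series(x, terms):
    """Partial sum of sin x = sum over n >= 0 of (-1)^n x^(2n+1) / (2n+1)!."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1) for n in range(terms))


x = 0.5
for terms in (2, 4, 8):
    exp_err = abs(exp_series(x, terms) - math.exp(x))
    sin_err = abs(sin_series(x, terms) - math.sin(x))
    print(terms, exp_err, sin_err)  # truncation errors shrink as terms are added

# First-order binomial approximation, reasonable only when |x| is much smaller than 1
print((1 + 0.01) ** 5, 1 + 5 * 0.01)
```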
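E.4 Sums

$$\sum_{m=0}^{k} r^m = \frac{r^{k+1} - 1}{r - 1} \qquad r \neq 1$$

$$\sum_{m=M}^{N} r^m = \frac{r^{N+1} - r^M}{r - 1} \qquad r \neq 1$$

$$\sum_{m=0}^{k} \left(\frac{a}{b}\right)^m = \frac{a^{k+1} - b^{k+1}}{b^k (a - b)} \qquad a \neq b$$

E.5 Complex Numbers

$$e^{\pm j\pi/2} = \pm j$$

$$e^{\pm jn\pi} = \begin{cases} 1 & n \text{ even} \\ -1 & n \text{ odd} \end{cases}$$

$$e^{\pm j\theta} = \cos\theta \pm j\sin\theta$$

$$a + jb = re^{j\theta} \qquad r = \sqrt{a^2 + b^2}, \quad \theta = \tan^{-1}\left(\frac{b}{a}\right)$$

$$(re^{j\theta})^k = r^k e^{jk\theta}$$

$$(r_1 e^{j\theta_1})(r_2 e^{j\theta_2}) = r_1 r_2 e^{j(\theta_1 + \theta_2)}$$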

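E.6 Trigonometric Identities

$$e^{\pm jx} = \cos x \pm j \sin x$$

$$\cos x = \tfrac{1}{2}\left(e^{jx} + e^{-jx}\right)$$

$$\sin x = \tfrac{1}{2j}\left(e^{jx} - e^{-jx}\right)$$

$$\cos\left(x \pm \tfrac{\pi}{2}\right) = \mp \sin x$$

$$\sin\left(x \pm \tfrac{\pi}{2}\right) = \pm \cos x$$

$$2 \sin x \cos x = \sin 2x$$

$$\sin^2 x + \cos^2 x = 1$$

$$\cos^2 x - \sin^2 x = \cos 2x$$

$$\cos^2 x = \tfrac{1}{2}(1 + \cos 2x)$$

$$\sin^2 x = \tfrac{1}{2}(1 - \cos 2x)$$

$$\cos^3 x = \tfrac{1}{4}(3\cos x + \cos 3x)$$

$$\sin^3 x = \tfrac{1}{4}(3\sin x - \sin 3x)$$

$$\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y$$

$$\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$$

$$\tan(x \pm y) = \frac{\tan x \pm \tan y}{1 \mp \tan x \tan y}$$

$$\sin x \sin y = \tfrac{1}{2}[\cos(x - y) - \cos(x + y)]$$

$$\cos x \cos y = \tfrac{1}{2}[\cos(x - y) + \cos(x + y)]$$

$$\sin x \cos y = \tfrac{1}{2}[\sin(x - y) + \sin(x + y)]$$

$$a \cos x + b \sin x = C \cos(x + \theta)$$

in which $C = \sqrt{a^2 + b^2}$ and $\theta = \tan^{-1}\left(\frac{-b}{a}\right)$

E.7 Indefinite Integrals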


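$$\int u\, dv = uv - \int v\, du$$

$$\int f(x)\, g'(x)\, dx = f(x)g(x) - \int f'(x)\, g(x)\, dx$$

$$\int \sin ax\, dx = -\frac{1}{a}\cos ax \qquad\qquad \int \cos ax\, dx = \frac{1}{a}\sin ax$$

$$\int \sin^2 ax\, dx = \frac{x}{2} - \frac{\sin 2ax}{4a} \qquad\qquad \int \cos^2 ax\, dx = \frac{x}{2} + \frac{\sin 2ax}{4a}$$

$$\int x \sin ax\, dx = \frac{1}{a^2}(\sin ax - ax\cos ax)$$

$$\int x \cos ax\, dx = \frac{1}{a^2}(\cos ax + ax\sin ax)$$

$$\int x^2 \sin ax\, dx = \frac{1}{a^3}\left(2ax\sin ax + 2\cos ax - a^2x^2\cos ax\right)$$

$$\int x^2 \cos ax\, dx = \frac{1}{a^3}\left(2ax\cos ax - 2\sin ax + a^2x^2\sin ax\right)$$

$$\int \sin ax \sin bx\, dx = \frac{\sin(a-b)x}{2(a-b)} - \frac{\sin(a+b)x}{2(a+b)} \qquad a^2 \neq b^2$$

$$\int \sin ax \cos bx\, dx = -\left[\frac{\cos(a-b)x}{2(a-b)} + \frac{\cos(a+b)x}{2(a+b)}\right] \qquad a^2 \neq b^2$$

$$\int \cos ax \cos bx\, dx = \frac{\sin(a-b)x}{2(a-b)} + \frac{\sin(a+b)x}{2(a+b)} \qquad a^2 \neq b^2$$

$$\int e^{ax}\, dx = \frac{1}{a}e^{ax}$$

$$\int x e^{ax}\, dx = \frac{e^{ax}}{a^2}(ax - 1)$$

$$\int x^2 e^{ax}\, dx = \frac{e^{ax}}{a^3}\left(a^2x^2 - 2ax + 2\right)$$

$$\int e^{ax} \sin bx\, dx = \frac{e^{ax}}{a^2 + b^2}(a\sin bx - b\cos bx)$$

$$\int e^{ax} \cos bx\, dx = \frac{e^{ax}}{a^2 + b^2}(a\cos bx + b\sin bx)$$

$$\int \frac{1}{x^2 + a^2}\, dx = \frac{1}{a}\tan^{-1}\frac{x}{a}$$

$$\int \frac{x}{x^2 + a^2}\, dx = \frac{1}{2}\ln\left(x^2 + a^2\right)$$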
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Adaptive delta modulation (ADM), 300 
Adaptive differential pulse code modulation 
(ADPCM), 294—295 

Additive white Gaussian noise (AWGN) ; 10, 
536 

Aliasing error (spectral folding), 121, 
259-261 

All-pass vs. distortionless system, 93-94 
Alternate mark inversion (AMI), 327, 
339-341 

Amplitude modulation (AM), 11, 83-84, 

140, 141-142, 151-158, 470-471 
band width-efficient, 158-167 
demodulation of, 156-158 
double-sideband, 142-151 
generation of, 156 
pulse (PAM), 267 
single-sideband (SSB), 159-160 
vestigial sideband (VSB), 167-170 
Amplitude shift keying (ASK), 373,581-584 
and AM modulation, connection between, 
376 

binary (BASK), 521 
detection, 377 
Analog signals, 4, 23-24 
Analog-to-digital (A/D) conversion, 6-7, 

251 

maximum information rate, 262-263 
nonideal practical sampling analysis, 
263-267 

signal reconstruction, 253-258 
aliasing error (spectral folding), 

259-261 

antialiasing filter, 261 


filters, realizability of, 258-259 
ideal, 254-255 
practical, 255-258 

Angle modulation, 141-142, 202, 204 
bandwidth of, 209-222 
features of, 229-231 
immunity of, 234-235 
narrowband, 210-211 
power of, 206-209 
Antheil, George, 623 
Antialiasing filter, 261 

vs. matched filter, 670-673 
Antipodal signals, 574 
Aperiodic signals, 24-25 

representation by Fourier integral, 

62-69 

conjugate symmetry property, 66 
Fourier transform, existence of, 

67-68 

Fourier transform, linearity of, 68 
Fourier transform, physical 
appreciation of, 68-69 
Aposteriori probability, 541, 749 
Arbitrarily small error probability, 

802 

Armstrong, Edwin H t , 220-222 
indirect methods of, 225-227 
Asymmetric digital subscriber line (ADSL), 
707-708 

Asynchronous channels, 287-288 
Autocorrelation function, 36, 109-111 
of power signals, 113-117 
Automatic gain control (AGC), 103 
Automatic repeat request (ARQ), 805 
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Balanced circuits, 147 
Balanced discriminator, 233 
Balanced modulators, 147 
double, 147 
single, 147 

Band-limited white Gaussian noise, 761 
Bandpass limiter, 223-224 
Bandpass matched filter, as coherent 
receiver, 521-522 
Bandpass random process, 491^99 
Bandpass signals, 85-87 
Bandwidth, 9 

of angle modulated waves, 209-222 
essential, 105-108, 122, 335 
of product of signals, 88 
Bandwidth-efficient amplitude modulations, 
158-167 

Baseband analog systems, performance 
analysis of, 486-488 
Baseband communications, 140-141 
Baseband signal, 2, 11, 140 
Basis functions, 39, 530-531 
Basis vectors, 37, 526 

Bayes’ decision rule, 399,424, 542, 578, 749 

Bayes receiver, 578-579 

Bayes’ theorem, 406-407 

BCJR MAP decoding algorithm, 846-850 

Bernoulli trials, 400-404 

Bessel function, modified zero-order, 498 

Best filter, 509 

Bezout identity, 686-687 

Binary 

amplitude shift keying (BASK), 521 
with eight-zero substitution (B8ZS), 343 
message, 4 

withN zero substitution (BNZS) 
signaling, 343 

phase shift keying (BPSK), 520-521 
polar signaling, optimum linear detector 
for, 506-512 
signaling, 512-520 

with six-zero substitution (B6ZS), 343 
symmetric channel (BSC), 745, 809 
systems, 516-520 

with three-zero substitution (B3ZS), 343 
threshold detection, 507-508 


Bipartite (Tanner) graph, 856-857 
Bipolar (pseudotemary) signaling, 327, 
339-341 

Bit (binary’ digit), 269 
Bit error rate (BER), 506, 515, 553, 823 
of orthogonal signaling, 566-567 
Bit-flipping LDPC decoding, 857 
Bit loading, 705-706 
Bit stuffing, 287-288 
Blind equalization, 711-712 
Block codes, 802 
linear, 806, 808 
Block interleaver, 839 
Bluetooth, 621-622 
Bose-Chaudhuri-Hocquenghen (BCH) 
codes, 822 

Bounded-input-bounded-output (BIBO) 
linear system, 90 
Broad OFDM applications, 709 
Burst error correction codes, 366, 826 
Butterworth filters, 97 

Carrier, 11, 83 

Carrier communications, 141 
OFDM, 692-702 
Carrier power, 155-156 
Carson’s rule, 213 

Cauchy-Schwartz inequality, 35n, 489, 509, 
875 

Causal signal, 27 

CCITT (Comity Consultatif International 
Teldphonique et Telegraphique), 

172 

cdmaOne (IS-95), 637,644, 645-646 
Cellular networks, 643-644 
Cellular systems, CDMA in, 644-645 
Central limit theorem 

for sample mean, 446-448 
for sum of independent random variables, 
448 
Channel, 3 
Channel bank, 289 

Channel capacity, 10, 745, 751, 753, 754, 
764 

of band-limited AWGN channel, 764-767 
of continuous memoryless channel, 756 
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of discrete memoryless channel, 748 
of infinite bandwidth, 767 
Channel diversity, 715 
Channel equalization, 669 
receiver, 670-676 
Channel estimation, 688-689 
Channel matrix, 749 
Channel shortening, 706 
Chase algorithms, 842-843 
Chebychev filters, 97 
Chebyshev’s inequality, 435-436 
Chrominance signals, 167 
Cochannel interference, 167 
Code-division multiple access (CDMA), 623 
in cellular phone networks, 643-647 
ofDSSS, 630-637, 643-649 
in GPS, 647-649 
power control in, 636-637 
Code efficiency, 742 
Code generator polynomial, 814 
Code rate, 802, 803 
Code tree, 828 

Coherent AM demodulation, 152 
Coherent receivers, for digital carrier 
modulations, 520-525 
Color burst, 167 

Common channel interoffice signaling 
(CCIS), 284 
Communication 
baseband, 140-141 
carrier, 141 
systems, 1^4 
Compact code, 741 
Compact disc, 270 
Compandor, 275,276-277 
Complement, 394 
Complete orthogonal basis, 37, 38 
Complete orthogonal set, 38,527 
Concatenated codes, 840-841 
Conditional densities, 424 
Conditional entropy, 749 
Conditional probability, 398^00 
multiplication rule for, 404 
of random variables, 410—412 


Conference on European Postal and 

Telegraph Administration (CEPT), 
284 

Constraint length, 827 
Continuous phase frequency shift keying 
(CPFSK), 524 

Continuous random variables, 409,413-416 
Continuous time signal, 23 
Convolution 
codes, 802, 827 
frequency, 87 
time, 87 

Convolution theorem, 87-88 
Coprime, 687 

Correlation coefficient, 34-36,436-439 
Correlation functions, 35-36 
Correlation receiver, 512 
Costas loop, 180 

Cross-correlation coefficients, 35, 561 
Cross-correlation function, 479 
Cross-power spectral density, 479-480 
Cumulative distribution function (CDF), 
412-413,458 
Cyclic codes, 813-822 
Cyclic linear block code theorem, 814 
Cyclic prefix, 694, 706 
Cyclic Redundancy Check (CRC) codes, 822 

D -dimension a I sphere, 769-772 
Decision feedback equalizer (DFE), 689-692 
error propagation in, 691-692 
Decision feedback MUD receiver, 642-643 
Decision regions and error probability, 
545-551 

Decorrelator MUD receiver, 640 
Deductive logic, 407 
Delta modulation (DM), 141, 295-300 
adaptive (ADM), 300 
comparison with PCM, 296-297 
overloading in, 297-299 
sigma-delta modulation, 299-300 
threshold of coding, 297-299 
Demodulation, 11,13, 140, 143-146, 202 
of AM signals, 156-158 
of DSB-SC signals, 151 
of FM signals, 231-234 
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Destination, 3 

Detection error probability, 365-366 
Detection signal space dimensionality, 
538-541 

Deterministic signals, 25-26, 393 
Differential encoding, 354-355, 379, 587 
Differential GPS, 648 
Differential phase shift keying (DPSK), 
378-380 

error probability, 587-589 
Differential pulse code modulation (DPCM), 
290-293 

analysis of, 292-293 
SNR improvement, 293 
Digital audio broadcasting (DAB), 709-711 
Digital broadcasting, 709 
Digital carrier systems 

analog and digital carrier modulations, 
connections between, 376-377 
binary carrier modulations, 372-374 
demodulation, 377-380 
PSD of, 374-376 

Digital communication systems, 326-329, 
666 

advantages of, 270-271 
line coder, 327-328 
multiplexer, 328 
performance analysis of, 506 
regenerative repeater, 328-329 
source, 326 

Digital data system (DDS), 289 
Digital data transmission, principles of, 326 
Digital multiplexing 
asynchronous channels and bit stuffing, 
287-288 

plesiochronous digital hierarchy, 288-290 
signal format, 285-287 
Digital signals, 4-8, 23-24 
at level 0 (DS0), 288 
at level 1(DS1), 282, 289 
distortionless regenerative repeaters and 
nodes, viability of, 5-6 
noise immunity of, 4-5 
pulse code modulation, 7-8 
Digit interleaving, 285 
Diode bridge modulator, 148 


Direct FM generation, 227-229 
Direct sequence spread spectrum (DSSS) 
against broadband jammers, 630 
CDMA of, 630-637, 643-651 
against narrowband jammers, 629-630 
PN sequence generation, 625-626 
PSK, optimum detection of, 624-625 
resilient features of, 628-630 
single-user, 626-628 
Discrete Fourier transform (DFT) 

FFT algorithm in, 123 
numerical computation of, 118-123 
points of discontinuity, 122 
Discrete multitone (DMT) modulation, 
702-706 

reaMife applications of, 707-711 
Discrete random variables, 408-410 
Discrete time signal, 23 
Dispersion, 98 
Distortion, 92, 97-103 

in audio and video signals, 94-95 
linear, 97-99 

due to multipaths, 101-102 
nonlinear, 99-101, 234-239 
Distortionless transmission, 92-95 
Distributed coordinator function (DCF), 651 
Doppler effect, 712 
Double balanced modulator, 147 
Double-sideband, suppressed-carrier 

(DSB-SC) modulation, 142, 143 
carrier acquisition in, 179-180 
nonlinear, 146-147 
signals, demodulation of, 151 
switching, 147-151 

Double-sideband amplitude modulation, 
142-151 

Downconversion, 151 
Duality 

property, 77-79 
time-frequency, 76-77 
Duobinary signaling, 351-352 
detection of, 353-355 
modified, 353 

Elastic store, 287 
Element, 394 
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Energy 

of modulated signals, 108-109 
signal, 20-21, 22, 25, 103-111 
scaler product and, 528-529 
of sum of orthogonal signals, 34 
Energy spectral density (ESD), 104-105 
of input and output, 111 
time autocorrelation function and, 
109-111 

Ensemble, of random process, 456, 459 
statistics, 459 
Entropy, 14, 737 
Envelope detection, 152-154 
condition for, 153-155 
Envelope detector, 157-158 
Equalizers 

decision feedback (DFE), 689-692 
feedforward (FFW), 690 
finite length MMSE, 681-682 
fractionally spaced (FSE), 684, 686-687, 
688 

linear, 689-690 
time domain (TEQ), 706 
zero-forcing, 359-362, 677-679 
Equivalent optimum binary receivers, 516 
Equivalent signal sets, 569-577 
Ergodic wide-sense stationary processes, 
463-465 

Error correction coding, 14-15 
Error-free communication, 745-748, 
754-755, 768-769 

Error probability of optimum receivers, 
561-569 

Error propagation, in DFE, 691-692 
Error vector, 29 

Essential bandwidth, 105-108, 122, 335 
Event, 394 

Exclusive OR (XOR), 805 
Experiment, 393-394 
Exponential Fourier series, 39-46 
Fourier spectra, 41^42 
negative frequency mean, 42-45 
ParsevaFs theorem in, 46 
Exponential Fourier spectra, 43^12 
Exponential modulation, 204 
Extended superframe (ESF), 284 


Eye diagrams, 366-369 
in PAM, 372 

Fading channels, 103 
flat, 714-715 

frequency-selective, 713-714 

conversion to flat fading channel, 715 
False alarm, 580 
False dismissal, 580 
Faraday, Michael, 16 
Fast Fourier transform (FFT) 

algorithm in DFT computations, 123 
Feedback decoding, 837 
Feedforward (FFW) equalizers, 690 
Filtering, 128 
Filters 

antialiasing, 261 
bandpass matched, 521-522 
best, 509 
Bultcrworth, 97 
first-order-hold, 258 
ideal vs* practical, 95-97 
matched, 509-512 
optimum receiver, 508-512 
reconstruction filters, realizability of, 
258-259 
VSB,168-169 

Finite length MMSE equalizers, 681-682 

First-order-hold filter, 258 

Flat fading channels, 714-715 

Folding frequency, 122 

Forward error correcting (FEC) codes, 802 

Fourier integral 

aperiodic signal representation by, 62-69 
Fourier series 
exponential, 39-46 
ParsevaFs theorem in, 46 
Fourier transform 
direct, 65 
discrete 

numerical computation of, 118-123 
duality property, 77-79 
existence of, 67-68 
frequency-shifting property, 77, 83-87 
inverse, 65 
linearity of, 68 
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Fourier transform (Continued) 
physical appreciation of, 68-69 
time differentiation property, 88-90 
time-frequency duality, 76-77 
time integration property, 88-90 
time-scaling property, 79-81 
time-shifting property, 81-82 
Fractionally spaced equalizers (FSE) 

MMSE design, 688 
SIMO model, 684-686 
ZF design, 686-687 
Frame, 283 
Framing bit, 283 

Frequency converter (mixer), 150-151 
Frequency convolution, 87 
Frequency counters, 233 
Frequency demodulators, practical, 232-233 
Frequency division multiplexing (FDM), 13, 
141, 172-173 

Frequency bopping spread spectrum 
(FHSS), 614-617 
applications of, 621-624 
asynchronous, 619-620 
performance, with multiple user access, 
618-619 

Frequency modulation (FM), 11, 202, 204 
broadcasting system, 241-242 
narrowband (NBFM), 210, 211 
and phase modulation, relationship 
between, 205-206 
signals, demodulation of, 231-234 
tone, 214-217 
waves, generating, 222-231 
wideband (WBFM), 211-213 
Frequency multipliers, 225 
Frequency resolution, 122 
Frequency shift keying (FSK), 208, 373, 
522-523, 584-586 
detection, 377-378 

and FM modulation, connection between, 
376 

Gaussian, 622 

M -ary FSK and orthogonal signaling, 
380-382 

Frequency-selective channels, 669, 776-780 


Frequency-selective fading channel, 102, 
713-714 

conversion to flat fading channel, 715 
Frequency-shifting property, 77, 83-87 
Full-cosine roll-off characteristics, 349 

Gaussian approximation, of nonorthogonal 
MAI, 633-634 

Gaussian random process properties, 
534-536 

Gaussian random variable, 416-422 
sum of, 444^-46 
Generalized angle, 203 
Generalized Fourier series, 39 
Generator matrix, 806, 818-819 
Generator polynomial, 818-819 
code, 814 

Geometric interpretation, in signal space, 
546—551 

Geometrical signal space, 525-527 
Global Positioning System (GPS), 647 
differential, 648 
operation, 647-648 
spread spectrum in, 648-649 
Gram-Schmidt orthogonalization, 530, 531, 
877-879 
Gray code, 553 
Group (envelope) delay, 94n 
GSM cellular phones, 675 

Hadamard inequality, 787 
Hamming bound, 804 
Hamming codes, 804, 812-813 
Hamming distance, 746, 807 
Hard-decision decoding, 841 
Hermitian symmetry, 66n 
High-definition television (HDTV), 309-310 
High-density bipolar (HDB) signaling, 342 
High-quality LP vocoders, 304 
High-speed packet access (HSPA), 647 
Hilbert transform, 160-161 
Hold-in (lock) range, 178 
Homodyne modulators, 151 
Huffman code, 740-745,789-791 
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Ideal vs. practical filters, 95-97 
IEEE 802. II, 621 -622, 649-651 
Image stations, 240 
Impulse noise, 366 

Independence vs. uncorrelatedness, 439 
Independent events, 400 
Independent random variables, 425 
variance of sum of, 434-435 
Indirect FM generation, 225-227 
Inductive logic, 407 
Information 

commonsense measure of, 734-735 
engineering measure of, 735-737 
entropy of a source, 737-739 
In-phase component, 491 
Input transducer, 2 
Instantaneous frequency, 203-204 
Instantaneous velocity, 203 
Integrated services digital network (ISDN), 
371 

Interleaved code, 839-840 
Interleaving depth, 839 
Interleaving effect, 241 
International Mobile 

Telecommunications-2000 standard 
(IMT-2000), 646 

International Telecommunications Union 
known, 172 

Interpolation, 253, 255-258 
ideal, 254—255 
Intersection, 395 

Intersymbol interference (ISI), 98, 343-344, 
669 

Jitter, 364, 365 

Joint distribution, 422^424 

Joint event, 395 

Joint source-channel coding, 15 

Justification, 287 

Karhunen-Loeve expansion, 531, 577 

Lamarr, Hedy, 623 
L-ary digital signal, 269 
Levinson-Durbin algorithm, 302 


Linear distortion, 3, 97-99 
Linear equalizers, 689-690 
Linear mean square estimation, 

440-443 

Linear prediction coding (LPC) vocoders, 
301-304 

high-quality LP vocoders, 304 
LPC-10 vocoder, 303 
models, 302-304 

voice models and model-based vocoders, 
301-302 
Linear system 

signal transmission through, 90-95 
distortion, in audio and video signals, 
94-95 

distortionless transmission, 92-95 
signal distortion, 92 

Linear time-in variant (LTI) continuous time 
system, 90, 111 
frequency response of, 91-92 
Line coder, 327-328 
Line coding, 327, 329-343 
bipolar (pseudotemary or AMI) signaling, 
339-341 

BNZS signaling, 343 
HDB signaling, 342 
on-off signaling, 337-339 
polar signaling, 334-336 
power spectral density, 330-334 
constructing dc null in, 336-337 
properties of, 329-330 
Line spectral pairs (LSP), 303 
Local carrier synchronization, 170-172 
Logarithmic units, 280-281 
Low-density parity check (LDPC) codes, 
854—860 

bipartite (tanner) graph, 856-857 
bit-flipping decoding, 857 
decoding, sum-product algorithm for, 
858-860 

LPC-10 vocoder, 303 

Manchester (split-phase) signaling, 

337 

Marginal densities, 423 
Marginal probabilities, 411 
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M -ary 

ASK and noncoherent detection, 380 
bandwidth and power trade offs, 554, 
567-568 

binary polar signaling, 551-554 
FSK and orthogonal signaling, 380-382 
message, 4, 5 

PAM signaling, 369-372, 383-384, 385 
QAM analysis, 384-385, 554-560 
trading power and bandwidth, 385-386 
Matched filter, 509-512 
MATLAB exercises 

for AM modulation and demodulation, 
185-188 

basic signals and signal graphing, 46—54 
coefficients D tty numerical computation 
of, 52-54 

for delta modulation, 317-319 
digital communication systems, 715 
performance analysis of, 589 
for DSB-SC modulation and 
demodulation, 181-185 
for error correcting code, 861 
for eye diagrams, 386-387 
for FM modulation and demodulation, 
242-246 

for Fourier transform computation, 
123-130 

for information theory, 789 
lowpass signals, sampling and 
reconstruction of, 310-313 
for PCM, 313-317 

periodic signals and signal power, 48—49 
for QAM modulation and demodulation, 
191-195 

signal correlation, 49-52 
for spread spectrum technologies, 651 
for SSB-SC modulation and 
demodulation, 188-191 
Matrix product and properties, 880 
Maximum a posterior (MAP) detection, 
BCJR algorithm for, 846 
Maximum a posteriori probability (MAP) 
detector, 542 

Maximum capacity power loading, 777-779 


Maximum information rate, in digital 
communication, 262-263 
Maximum length shift register sequences, 
625 

Maximum likelihood decoding, 809, 
831-834 

Maximum likelihood receiver, 579-580, 
639-640 

Maximum likelihood sequence estimation 
(MLSE), 673-676 

complexity and practical implementations, 
675-676 
Mean, 427-428 

of function of random variable, 428—429 
of product of two functions, 430 
of sum, 429 
Measure-zero set, 39n 
Memoryless source, 737 
Message signal, 2 
Minimax receiver, 580-581 
Minimum energy signal set, 572-575 
Minimum mean square error (MMSE), 363, 
679 

finite data design, 683-684 

finite length MMSE equalizers, 681-682 

FSE design, 688 

and optimum delay, 681 

receiver, 640-642 

vs + ZF, 682-683 

Minimum shift keying (MSK), 523-525 
Minimum weight vector, 810 
Mobile telephone switching office (MTSO), 
644 

Modem, 372 

Modem telecommunications, historical 
review of, 15-19 

Modified duobinary signaling, 353 
Modulated signal, 83 
phase spectrum, shifting, 84-85 
power spectral density of, 118 
Modulating signal, 83 
Modulation 

amplitude (AM), 11, 83-84, 140, 141-142 
angle, 141-142, 202, 204 
application of, 84-85 
delta (DM), 141,295-300 
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discrete multitone (DMT), 702 
dou bl e- si deband, su ppressed - c arrier 
(DSB-SC), 142, 143 
double-sideband amplitude, 142-151 
frequency (FM), 11, 202, 204, 205-206, 
210-217 
index, 213 
nonlinear, 202-209 
phase (PM), 11, 204, 205-206, 210, 
213-214 

putse amplitude, 141 
pulse code (PCM), 7-8, 141,229, 
402-403, 774-776 
pulse position (PPM), 141,267, 268 
pulse width (PWM), 141,267, 268 
single-sideband (SSB), 159-160 
tone, 144 

vestigial sideband (VSB), 167-170 
Modulators 
balanced, 147 
coherent, 151 
diode bridge, 148 
double balanced, 147 
frequency, 232-233 
homodyne, 151 
multiplier, 146 
nonlinear, 146-147 
single balanced, 147 
switching, 147-151 
synchronous, 151 

Moments of random variables, 430 
central, 430 
Morse code, 4, 14 

Moving Picture Experts Group (MPEG), 
301, 304-309 
MPSK signals, 557-560 
Multiamplitude signaling, 551-554 
Multicarrier communication system, 699 
Multipath transmission, 101-102 
Multiple-input-multiple-output (MIMO), 
715 

channel capacity, 781-783, 794-796 
transmitter with channel knowledge, 
785-789 

transmitter without channel knowledge, 
783-785 


Multiplexer, 328 
Multiplexing, 12-13 
digital, 285-290 

frequency division (FDM), 13, 141, 
172-173 

time division (TDM), 13, 267, 268 
T1 time division, 281-282 
Multiplication rule, for conditional 
probabilities, 404 
Multiplier modulators, 146 
Multitone signaling (MFSK), 564-566 
noncoherent, 586-587 
Multiuser detecion (MUD), 637-643 
decision feedback receiver, 642-643 
decorrelator receiver, 640 
MMSE receiver, 640-642 
optimum, 639-640 
vs* power control, 647 
Mutual information, 750, 762-764 
channel capacity and, 791-794 
Mutually exclusive (disjoint) events, 395 

Narrowband angle modulation, 210-211 
Narrowband frequency modulation (NBFM), 
210,211 

generation, 222-223 
Narrowband modulation, 210-211 
Narrowband phase modulation (NBPM), 210 
Natural binary code (NBC), 269 
Near-far problem, 635-636 
Near-far resistance, 637 
Nodes, 5-6 
Noise, 3, 14 

Noisy channel coding theorem, 802 
Noncoherent detection, 581-589 
Nonideal practical sampling analysis, 
263-267 

Nonlinear distortion, 3, 99-101 
Nonlinear DSB-SC modulators, 146-147 
Nonlinear modulation, 202-209 
Non-return-to-zero (NRZ) pulses, 328 
Nonwhite channel noise, 577 
Null event, 394 

Nyquist criteria for zero IS1, 344-349 
Nyquist interval, 253 
Nyquist sampling rate, 253 
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Offset QPSK (OQPSK), 645 
On-off keying (OOK), 373 
On-off signaling, 327, 337-339, 518-519 
Optimum delay, 681 
Optimum filter, 483-486 
Optimum linear precoder, 789 
Optimum linear receiver analysis, 512-516 
Optimum MUD receiver, 639 
Optimum power distribution, 704 
Optimum power loading 
in OFDM/DMT, 780 
water-pouring interpretation of, 
779-780 

Optimum preemphasis-deemphasis systems, 
488-491 

Optimum receiver 
filter, 508-512 

for white Gaussian noise channels, 536 
Optimum threshold, 513-515 
Orthogonal frequency division modulation 
(OFDM) 

channel equalization and, 669, 701-702 
channel noise, 698-700 
cyclic prefix redundancy in, 701 
principles of, 692-698 
real-life applications of, 707-711 
zero-padded, 700-701 
Orthogonality 

complex signal space and, 32-33 
of exponential signal set, 874 
of trigonometric signal set, 873 
Orthogonal signaling, 519-520, 562-564, 
776 

bandwidth and power trade-offs of M-ary, 
567-568 

energy of sum of, 34 
Orthogonal signal sets, 36-39 
Orthogonal signal space, 38-39 
Orthogonal vectors, 30, 526-527 
Orthogonal vector space, 36-38 
Orthonormal basis set, 38, 529-530 
Orthonormal vectors, 527 
Outcomes, 394 
Output transducer, 3 
Overhead bits, 285 


Paley-Wiener criterion, 96, 259 
Parity check digits, 806 
Parity check matrix, 808 
Parseval's theorem, 39, 103-104 
in Fourier series, 46 

Partial reflection coefficients (PARCOR), 
303 

Partial response signaling, 350-351 
Perfect code, 804 
Periodic signals, 24-25 
Phase coherent (in phase) lock, 178 
Phase delay, 94n 

Phase-locked loop (PLL), 172, 173-181, 
233-234 

basic operation, 174-175 
first-order loop analysis, 177-178 
hold-in (lock) range, 178 
phase coherent (in phase) lock, 178 
pull-in (capture) range, 178 
small-error analysis, 175-177 
Phase modulation (PM), 11, 204, 213-214 
and frequency modulation, relationship 
between, 205-206 
narrowband (NBPM), 210 
Phase shift keying (PSK), 208, 373 
binary (BPSK), 520-521 
detection, 378 

differential (DPSK), 378-380 
differentially coherent, 587-589 
and QAM modulation, connection 
between, 376-377 
Phase shift method, 163 
Piconet, 621 

Plain-old-telephone-service (POTS), 707 
Plesiochronous digital hierarchy, 288-290 
Polar signaling, 334-336, 516-518 
Power 

of angle modulated wave, 206-209 
control, 636-637 
vs. MUD, 647 
loading, 704 
signal, 25, 111-118 

Power spectral density (PSD), 111-112, 
330-334, 465, 486 
of amplitude shift keying, 374-376 
constructing dc null in, 336-337 
of digital carrier modulation, 374-376 
of frequency shift keying, 375-376 
input and output, 117-118 
interpretation of, 114 
of modulated signals, 118 
of phase shift keying, 375-376 
Prediction coefficients, 292 
Probability, 393-408 
axiomatic theory of, 407-408 
Probability density function (PDF), 414, 458 
Product code, 840 
Progressive taxation, 274-278 
Pseudonoise (PN) sequence generation, 
625-626 

Public switched telephone network (PSTN), 
707 

Pull-in (capture) range, 178 
Pulse amplitude modulation (PAM), 141, 
267, 268, 551-554 

M-ary baseband signaling, for higher 
order rate, 369-372 

Pulse code modulation (PCM), 7-8, 141, 
229, 267, 268-281, 774-776 
channel noise, mean square value of, 
432-434 

differential (DPCM), 290-293 
encoder, 278 

quantization error, mean square value of, 
431-432 

repeater error probability, 402-403 
in T1 carrier systems, 281-284 
total mean square error in, 435 
transmission bandwidth and output SNR, 
278-281 

Pulse detection error, 271 
Pulse generation, 355 

Pulse position modulation (PPM), 141, 267, 
268 

Pulse shaping, 336-337, 343-355 
controlled ISI or partial response 
signaling, 350-351 
differential encoding, 354-355 
duobinary pulse, 351-352, 353-355 
intersymbol interferences, 343-344 
Nyquist's criteria for zero ISI, 344-349 
in PAM, 371 


zero-ISI, duobinary, and modified 
duobinary, pulse relationship 
between, 352-353 
Pulse stuffing, 287-288 
Pulse width modulation (PWM), 141, 267, 
268 

QCELP (Qualcomm code-excited linear 
prediction) vocoder, 645 
Quadrature, 491 

nonuniqueness of representation, 495-496 
Quadrature amplitude modulation (QAM), 
159, 165-167 
M-ary, 384-385, 554-560 
and PSK, connection between, 376-377 
Quadrature multiplexing, 165, 167 
Quantization, 7, 269, 271-273 
error, 271 
noise, 272 

nonuniform, 274-278 

Radiation, 11-12 
Raised cosine characteristics, 349 
Random interleaver, 840 
Randomness, 14 
Random processes, 393, 456 
autocorrelation function of, 459-461 
bandpass, 491-499 

baseband analog systems, performance 
analysis of, 486-488 
basis functions determination for, 
530-531 
binary, 471-473 
characterization of, 458-459 
ergodic wide-sense stationary processes, 
463-465 

Gaussian properties, 534-536 
independent process, 479 
multiple, 479-480 
optimum filter, 483-486 
optimum preemphasis-deemphasis 
systems, 488-491 
orthogonal process, 479 
PAM pulse train, 473-478 
power of, 468 
power spectral density, 465, 486 
stationary, 461 
sum of, 481-483 

transmission of, through linear systems, 
480-483 

uncorrelated process, 479 
wide-sense stationary process, 461-463 
Random variables 

conditional probabilities of, 410-412 
continuous, 409, 413-416 
discrete, 408-410 
Gaussian, 416-422 
independent, 425, 434-435 
sum of, 443-446 
Ratio detector, 233 
Rayleigh density, 425-427 
Receiver, 3 

Rectifier detector, 156-157 
Recursive systematic convolutional (RSC) 
code, 831, 850-851 
Redundancy, 13, 14, 742 
Reed-Solomon codes, 822 
Regenerative repeater, 5-6, 328-329, 
358-359 

Relative frequency, 395-398 
Relative likelihood, 842 
Resource exchange, 10-11 
Return-to-zero (RZ) pulses, 328 
Rice density, 499 
Ring modulator, 148, 149 
Robbed-bit signaling, 284 
Roll-off factor, 348 
Root-raised-cosine pulse, 674 
Row-column (RC) constraint, 857 

Sample function, of random process, 456 
Sample space, 394 
Sampling theorem, 6, 251-253 
applications of, 267-268 
Scalar product and signal energy, 528-529 
Scatter diagram, 437, 599 
Scrambling, 355-358 
Selective-filtering method, 163-164 
Sequential decoding, 834-837 
Series bridge diode modulator, 148, 149 


Shannon’s equation, 10, 773 
Shunt bridge diode modulator, 148, 149 
Sideband, 155-156 
Sigma-delta modulation, 299-300 
Signal distortion, 92 
in audio and video signals, 94-95 
Signal energy, 20-21, 22, 103-111 
scalar product and, 528-529 
Signal power, 9-10, 111-118 
time autocorrelation of, 113-117 
Signal reconstruction, 253-258 

aliasing error (spectral folding), 259-261 
antialiasing filter, 261 
filters, realizability of, 258-259 
ideal, 254-255 
practical, 255-258 
Signals 

bandpass, 85-87 
energy, 20-21, 22 
power, 21 
size of, 20-22 
vs. vectors, 28-34 
Signals, classification of, 22-26 
analog, 23-24 
aperiodic, 24-25 
continuous time, 23 
deterministic, 25-26 
digital, 23-24 
discrete time, 23 
energy, 25 
periodic, 24-25 
power, 25 
probabilistic, 25 
Signals, correlation of, 34-36 
correlation functions, 35-36 
Signal space 

analysis of optimum detection, 525-530 
and basis signals, 527-530 
Signal-squaring method, 179-180 
Signal-to-noise ratio (SNR), 9, 511 
in DPCM, 293 

exchange with bandwidth, 875 
in PCM, 278-281 
transmitter power loading for maximizing 
receiver, 703-704 
Simplex signal set, 575-577 
Simplified signal space and decision 
procedure, 541-545 
Sinc function, 70-72 
SINCGARS, 622-623 
Single balanced modulator, 147 
Single-input-multiple-output (SIMO) 
model, 684-685 

Single-sideband, suppressed-carrier 
(SSB-SC) modulation 
carrier acquisition in, 180-181 
Single-sideband (SSB) modulation, 159-160 
signals with carrier (SSB+C), 165 
systems, 163-164 
phase shift method, 163 
selective-filtering method, 163-164 
Weaver's method, 164 
time domain representation of, 161-163 
Sinusoidal carrier, 141 
Sinusoidal signal, in noise, 498-499 
Slope detection, 232, 233 
Slope overload, 298 
Slotted frequency hopping, 619 
SNR improvement, 483 
Soft decoding, 841-843 
Soft-output Viterbi algorithm (SOVA), 
844-845 
Source, 2 
Source coding, 13 
randomness and, 14 
redundancy and, 14 
Source encoding, 739-745 
Spectral density, 69 
energy (ESD), 104-105, 109-111 
power (PSD), 111-112, 114, 117-118, 
336-337, 330-334, 374-376 
Spectrum, 69 

direct sequence spread (DSSS), 624-637, 
643-651 

frequency hopping spread (FHSS), 
614-624 
phase, 84-85 

spread spectrum, in GPS, 614, 648-649 
vestigial, 348 


Standard deviation, 430 
State transition diagram, 829-830 
Stationary random process, 461 
Stationary white noise, 531 
Successive interference cancellation (SIC) 
receiver, 642 

Sum-product algorithm, for LDPC decoding, 
858-860 
Super frame, 284 
extended (ESF), 284 
Superheterodyne receiver, 239-240 
Superposition theorem, 68 
Switching modulators, for DSB-SC, 
147-151 

Synchronization, 283-284 
Synchronous detection, 144 
Synchronous modulators, 151 
Syndrome, 809, 837 
Systematic code, 806 
Systematic cyclic codes, 816-818 
Systems, 20 
BIBO linear, 90 
binary, 516-520 
cellular, 644-645 
communication, 1-4 
digital carrier, 372-380 
digital communication, 326-329, 506, 
666, 715 

FM broadcasting, 241-242 
global positioning (GPS), 647-649 
linear, 90-95 

multicarrier communication, 699 
T1 carrier, 281-284 

Thermal noise, 480-481 
3G cellular services, 646-647 
Threshold detection, 420-422 
binary, 507-508 

Time autocorrelation function, 109-111 
of power signals, 113-117 
Time convolution, 87 
Time differentiation property, 88-90 
Time division multiple-access (TDMA) 
systems, 287 
Time division multiplexing (TDM), 13, 267, 
268 

Time domain equalizer (TEQ), 706 
Time-frequency duality, 76-77 
Time integration property, 88-90 
Time scaling 

signal duration, reciprocity of, 80-81 
significance of, 79-80 
Time shifting, 81-82 

linear phase, physical explanation of, 
81-82 

Time-varying channels, 712-715 
Timing extraction, 363-364 
Timing jitter, 364, 365 
Toeplitz matrix, 361 
T1 carrier systems, 281-284 
synchronizing and signaling, 
283-284 

time division multiplexing, 281-282 
Tone frequency modulation 
spectral analysis of, 214-217 
Tone modulation, 144 
T1 multiplexer, 289 
T1 time division multiplexing, 
281-282 

Total probability theorem, 404-406 
Transmission, 11-12, 90-95 
digital data, 326 
distortionless, 92-95 
multipath, 101-102 
Transmitter, 2, 703-704 
MIMO, 783-789 
Trellis diagram, 830, 837-838 
T-spaced equalization (TSE), 671, 
676-684 

based on MMSE, 679 
zero-forcing equalizer, 677-679 
Turbo codes, 846-854 

Uncorrelated variables, 438 
mean square of sum of, 439 
Undetermined multipliers, 759 
Union, 394-395 
Unitary matrix, 883 
Unit impulse function, 26 
multiplication of, 26-27 


sampling property of, 27 
Unit impulse signal, 26-28 
Unit rectangular function, 70 
Unit step function, 27-28 
Unit triangular function, 70 
Upconversion, 151 

Variance, 430 

of sum of independent random variables, 
434-435 

Vector decomposition of white noise random 
processes, 530-536 

Vectors 

vs. signals, 28-34 

complex signal space and orthogonality, 
32-33 

component of vector along another 
vector, 28-30 

decomposition of signal and signal 
components, 30-32 
Vector space, 28 

Vestigial sideband (VSB) modulation, 
167-170 

in broadcast television, 169-170 
filter, 168-169 

signals with carrier (VSB+C), 168-169 
Vestigial spectrum, 348 
Video compression, 300, 304-309 
Viterbi algorithm, 831-834 
Vocoders, 300 

linear prediction coding, 301-304 
Voltage-controlled oscillator (VCO), 174 

Weaver's method of SSB generation, 164 
Wiener-Hopf filter. See Optimum filter 
White channel noise, 531-532 
White Gaussian noise, 515-516, 533-534, 
544 

White noise, 531-532 

additive white Gaussian noise (AWGN), 
536 

Gaussian, 515-516, 533-534, 536, 544 
geometrical representation of, 531-533 
stationary, 531 
vector decomposition of random 
processes, 530-536 

Wideband frequency modulation (WBFM), 
211-213 

Wide-sense stationary process, 461-463 
Wiener-Khintchine theorem, 468 
Wireless multipath channels, linear 
distortions of, 666-669 


Word interleaving, 285 

Zero-crossing detectors, 233 
Zero-forcing (ZF) equalizer, 359-362, 
677-679 

FSE design, 686-687 
Zero padding, 120, 700 




